When Node.js is the wrong tool for the job (medium.com/jongleberry)
85 points by vmware505 on Jan 6, 2017 | hide | past | favorite | 95 comments


It seems like a lot of JavaScript developers are repeating things like "more people know JavaScript so you don't have to learn a new language, which saves you time". I don't get it. In my experience, if you find a good developer, they can pick up C#, Swift, or Go pretty quickly, and if you can't find a good developer, the fact that they already know JavaScript is not much of an advantage. Even if the developer you hire already knows your language, they're going to be spending time learning your code base and how your organization works (shared repo? PRs? Feature branches? Code review? Coding standards?)

That, and node.js developers seem to repeat the claim that node.js makes delivery faster… but is it really any faster than ASP.NET, Rails, Django, or the Go stdlib? Those frameworks are already fast for prototyping and delivering bread-and-butter apps (and some of them let you do multithreading to boot).

I'm also really not interested in how things work for "typical CRUD apps" because those are so trivial to write in any decent environment.

I'm worried that node.js articles are the same kind of echo chamber that Rails articles were 10 years ago.


The language doesn't matter, but the ecosystem does. It is when it comes to "non-CRUD apps" that an ecosystem really starts to matter. One is probably not better than another, but getting started is hard.

Take the case of Node and JavaScript itself. The ecosystem is huge. The tooling is complex. A newbie starting out will not even be able to get many examples to run without a mesh of transpilers and build tools. There are roadblocks everywhere: what's that type annotation? What's with the strange JSX syntax? What's a transpiler? On top of that, ECMAScript itself is a moving target, more so than any other language.

Similar issues exist in other languages. C# is a simple language to master, but .NET is a huge ecosystem. There's not going to be a lot of resistance moving from C# to F#. But for the newbie, the challenges are daunting. Should you use Microsoft's preferred ORM, which ships with .NET, or write your own SQL? Which UI tooling should you use? There is plenty of advice on the web, yet it takes an experienced developer to make those decisions.

Consider this: if you chose Silverlight a few years back, your product is soup today.

Picking an ecosystem is not easy. It never has been.


> Consider this: if you chose Silverlight a few years back, your product is soup today.

The amazing part about JS is that most of the code written for browsers a decade ago still works today. It is extremely unlikely that we will see a company outright ban JS in their web browsers in the same way that iDevices don't support flash.


> The amazing part about JS is that most of the code written for browsers a decade ago still works today.

Yes, that is mostly true, but it glosses over two big problems with the JS ecosystem.

The first is that longevity seems to be a dirty word. I have seen professional JS developers say, without a hint of irony, that their code only needs to last a year, maybe two, so it doesn't need to be designed for long-term maintenance or ongoing development. That same attitude exists in much of the library and tool development, which means while your code might still run in 2 or 5 or 10 years, the dependencies for your code might not have kept up. Even if they still run, they're stuck at whatever level of features and security and flexibility they had when they were abandoned, which might well have been only six months or a year after you added them to your project. Now you have a perma-brake applied to everything you do.

The other big problem with the JS ecosystem is that JS itself is a moving target. Thanks to evergreen browsers, Node, and those controlling them, standardisation and stability also seem to be dirty words now. Your code from 10 years ago, using only basic JS language features, might still run. Unfortunately, your code from 2 years ago, using whatever alpha/beta quality JS features supported promises or observers or local storage at that time, might not. And the same goes for code in dependencies you introduced 2 years ago that are already abandonware.

Multiply those two factors and you have a recipe for instability and frequent unexpected failures unlike anything I have ever seen on any other platform.


Agreed. Adding to your point: If this were NASA, memories of Apollo 1 "go fever" would be a fitting analogy of our present circumstance.

"Nothing we did had any shelf life" was one of the core problems identified by Gene Kranz in his "tough and competent" speech to his mission control team after the fire. I think it's a salient point.

When all the parts are constantly churning, fires are to be expected. Progress is messy if we're not cautious, and I'm not sure if we are being cautious in the right places. We could learn from Mission Control. (Lots of room for debate, here! And I'm being purposefully vague.)


I will say that Go makes the default path pretty painless. If you can conform to the directory structure, you can build a binary for any platform from any platform with a single command--no metadata files or build scripts. From there, deployment is a matter of moving a single binary. No interpreter, no VM, no runtime dependencies. Its opinionated nature can be annoying, but it means a lot of choices are made for you. I'm not familiar with another language that offers such a fast path.


> I'm worried that node.js articles are the same kind of echo chamber that Rails articles were 10 years ago.

I tend to agree.

> That, and node.js developers seem to repeat the claim that node.js makes delivery faster…

The qualification was "if your team already knows JavaScript". OK... if the team already knows Java, building something like this in Java would be faster to deliver, and wouldn't hit some of the issues the author brought up (4 processes vs 1, single- vs multi-threaded, etc).

And the whole "oh, you know language X" - means almost nothing when the project is more than trivial hello world. Every project I've been brought in on, the qualifier was "must know tech X". And almost always (there were exceptions) I knew more about tech X and software dev in general than the original developers, and the crunch was not "how do I do XYZ in this tech?" it was "how do I get the other developers to actually use version control?" or "use version control sanely?" or "document anything?" or "write tests" or "have test data" or "have a repeatable build process"?

Knowing ASP or PHP or Ruby or whatever, but going in to a project without repeatable build process, tests, documentation, requirements or version control, is a recipe for disaster. Stressing the "deliver quickly" aspect of any language, if you're actually trying to deliver for a business or with a team of people, is extremely destructive short term thinking. And yes, sometimes, in rare cases, it may be a necessary evil, but I think it's become the norm with words like "agile" being thrown around as synonyms for "don't have to write anything down".


> Every project I've been brought in on, the qualifier was "must know tech X". And almost always (there were exceptions) I knew more about tech X and software dev in general than the original developers, and the crunch was not "how do I do XYZ in this tech?" it was "how do I get the other developers to actually use version control?" or "use version control sanely?" or "document anything?" or "write tests" or "have test data" or "have a repeatable build process"?

I've seen this so many times on consulting gigs and with the occasional new employer. I'm hired because I'm familiar with the technologies they're trying to use, but the first major value I bring to the organization is simply getting some kind of sane process: working version control, automatic builds, test suites, one-button deploys, and so on.


> I've seen this so many times on consulting gigs and with the occasional new employer.

Same, and it's getting really fatiguing. Probably the #1 driver for me to get a side-project off the ground is to finally actually write code and not just play janitor for someone else's mess.


> Stressing the "deliver quickly" aspect of any language, if you're actually trying to deliver for a business or with a team of people, is extremely destructive short term thinking. And yes, sometimes, in rare cases, it may be a necessary evil, but I think it's become the norm with words like "agile" being thrown around as synonyms for "don't have to write anything down".

Unfortunately, this is pervasive in the "real world", especially with regard to client-driven (agency) work. Deadlines beat developers almost without fail. I agree that it is destructive short term thinking. Entire weeks of programming can be destroyed in a 15-minute phone call that (re)highlights a limitation clients either didn't account for in their specification (if you can call it that) or just chose to ignore.

This isn't to say that using a language or framework a developer is familiar with is baseless, but I've heard this advice before and see it reinforced often:

1 - Hire people not skills.

2 - Success is 90% preparation.


As someone still studying and not in the workforce yet, are there really teams out there that don't use version control?


Yes, I used to do a fair amount of consulting (aka nothing works properly, fix it, have this bag of cash) and I frequently ran into 'projects' that had no source control or had <projectname>/1 <projectname>/2 <projectname>/2a_1_fix_foobar :|.

One of the things you'll learn if you hang around Hacker News and other online programming communities is that a lot of the people in those communities are at the top end of the spectrum (there's a degree of self-selection that skews the proportion upward, I think).

It's pockets of relative sanity in a sea of disaster.


Yes. Also, some huge multinational companies you're very likely to have heard of don't do automated testing of any kind, at least for non-critical software (source: my own experience). I'm not talking about TDD or any fancy agile technique; I mean any kind of automated testing whatsoever.

If you're just out of university or read tech blogs or hackernews, you'd think everyone these days is doing TDD, pair programming and agile. The reality is that a lot of major businesses do none of that.


Re: automated testing. This can be quite time consuming to set up in many environments. Step one: get an Oracle database instance/schema that you can snapshot and roll back to a known starting state at will. Step two: get multiple connected app instances into a state to support each test run. Step three: just drop it, those things aren't gonna happen before you update your resume :-)

I'm ignoring the approach that says: Let's just spend a LOT of time writing mocks that stub out 90% of everything that matters, and pretend that the tests actually show something that matters.


No offense, but I think you're testing things all wrong. With any halfway-decent mocking library (which every major language I've encountered has), it's trivial to mock a HTTP client or ORM class or whatever and say "when called with foo, return bar." It should take no more code than it would to do the live data setup/rollback, and can be run in a completely isolated manner. If you only mock the calls that hit an external service, your tests will be more accurate because you've removed the chance of erroneous failures due to a problem with the external service. Spend the few hours it takes to learn to do it the right way once, and you'll never have to deal with all that nasty live setup/coordination/teardown again.


Agreed about integration tests sometimes being a pain in the... neck.

However, note I'm talking about something way more basic: many business don't do any kind of testing at all that isn't done by people manually trying the system. In the case of internal tools, this "testing" is often done directly by internal users trying to use the tool. This is way more common than one would suspect from reading tech blogs (unless one reads TheDailyWTF, of course).


I try to stress to clients that testing will always be done - we have a choice how much we do up front, behind the scenes, vs out in public, with real customers, money and data on the line. Still people choose, for a number of reasons, to have some/most/all of testing just be 'throw it out and see what happens'.


Because in a lot of ways those practices don't further the end goal or bottom line of the business. They don't meet business needs. As a developer, I think we both can agree that these things make our (developers') lives easier, but they often don't help the bottom line of the company. I know some devs will retort this, but if I were wrong, then companies would be mandating these practices top down.


It's primarily perspective. I just came off a short term project which... yeah, it worked, but it's a disaster (but.. yeah, still working, technically).

4 years of no version control, no test environment for at least a year (it was shut down then never turned on again), no written docs, no written tests (and for periods of time, no backups). This software manages and controls 9 figures of USD annually.

The 'bottom line' was/is always "i need a form that does X" and "I need a form/report to show Y". Yes, the immediate short term goals of the business worked.

The entirety of the codebase has been hardcoded to work with library calls that were known tech-dead-ends years ago.

The business now needs someone else to come in and manage things, because the original guy left. The business is now up a proverbial creek, because everything was in someone else's head. TDD/testing/docs externalizes some of that, and does have real business value, but it requires someone to be able to look more than 6 months in to the future with respect to tech decisions.

Their bottom line will be hurt over the next year because they will be needing to pay far more to repair the tech debt, and it will mean fewer new things get done (or.. they'll pay even more for that). So... "business value" needs to be defined as to how you're measuring it, and 'short term' is always easier than 'long term'.


Yes. I have a friend who does QA for a major university and that department doesn't use source control at all.


Yes, at my first job we had 3-5 devs at some points. We each had our own codebase (internal site, external site, console/Windows apps). We never touched each other's code.

Nobody ever set up source control for the code I had to work on, and nobody but me ever worked on it. The others did have source control as far as I know, and at least one codebase had 2 people working on it at a time. We did try to get me using it, but it was clunky and didn't last long since it was just me.

It seems so strange in hindsight but that whole place was so so different from where I am now.


Almost all teams these days that I know of have a revision / source / version control system of some kind purchased and installed.

The "purchased" ones are usually worse than the open source ones (e.g. - Git, SVN et al).

The "ENTERPRISE!!!" devs don't use the system effectively, usually: haul things off to a corner for weeks before doing a single check-in/commit; may or may not know how to branch; bumble with breakage during merge/pull operations; end-run the system by releasing non-compiled files directly to production without saving them in source control.

Source control use is like the other comment about programming language: the main issue is not whether you have used brand X, but whether or not you have a coherent mental model of what the hell you are doing in the first place. That said, there are some BAD commercial source control systems out there, preying upon the "you need this, even/especially if you don't really know why" crowd. These are usually sold to The Management in terms of the control they bring to source control - they will have a trigger to force entry of a trouble/enhancement ticket to go with a check-in/commit. The fact that they suck at source management is of no consequence (e.g. - painfully slow, little to no command line interface for automation, poor research/discovery features, poor branch/merge support, unreliable / data corruption)


You have a chance to see all manners of known bad practices in the wild.

It's important to recognize them as bad. It's also important to try to understand what lead to their adoption; often there are ways to work around that and apply a best practice in your area, even if other areas still suffer without it.


"Must know tech X" is exactly the mentality that we try to avoid in our company's hiring process. Most people would claim this makes your job descriptions too flexible, but the demand is actually incredibly specific: developers who can jump on a project in a tech they haven't "used" before and succeed 90% of the time. It's probably the case that only 10% of devs can do that comfortably though. Fits the criteria in 'Who' perfectly!


> I'm also really not interested in how things work for "typical CRUD apps"

You mean you're not interested in what a huge part of the industry is doing? Well then that explains why you don't see the benefit.

There's tons of work creating user interfaces, transforming data with some business logic, and shoving it back in a DB. No one said it was going to be a life-fulfilling experience. It's just work.

Now many of these shops want a native ui, mobile ui, responsive design and a back-end developer all in one. Many times they hire for specific frameworks so that you're ready to be productive day 1.

If I were staffing up a team to do a CRUD app, or even more complex, I'd staff up with JS developers.


I like doing CRUD apps. Beyond the trivial you run into a terrific amount of business logic, and that's an opportunity to really have an impact. One of my proudest moments as a programmer was taking a process that was largely manual and took 3 staff 2 days, and making it something the boss could do in half an hour on a Friday afternoon, by redefining the underlying assumptions about how their business worked. It's amazing how many businesses do <involved complex manual process> because that's how they've always done <involved complex manual process>.

It's 2017. We've had desktop computers for 35 years, and most of the people I know working in various industries still manage a huge chunk of their work in email and Excel, because it's hard for their management to understand the benefits of automation or why they'd invest in the longer term with good software. I know of one company that's in the FTSE 250 that manages its entire pallet and tray inventory using Excel spreadsheets on a shared drive and sending people from the office to sites to look at what's there (they order 100K+ trays at a time).


"Where Excel spreadsheets are used to run a business, there is an opportunity to sell software" (quote from memory).


Yeah, but Excel can be pretty cheap compared to custom software.


Look for cases where the cost of errors is high, then pitch the reliability of a managed workflow in the software.

Otherwise, yeah, trivial stuff probably is done best with spreadsheets, wikis and ad-hoc emails.


Absolutely this.

Keeping track of some office supplies ordering, excel is fine.

Keeping track of 100K pallets where delays because they got 'lost' knock on right down the supply chain not so much.


> You mean you're not interested in what a huge part of the industry is doing? Well then that explains why you don't see the benefit.

Yes, you chopped off the important half of the sentence, and responded as if I hadn't written it. I said that I wasn't interested because CRUD apps are already so easy to write in many different frameworks.

Let's assume that what you're saying is true—that most shops want all of those skills rolled into one person. The people I've met that are good at both front-end and back-end development are comfortable in multiple languages to begin with. The developers who only know JavaScript and are not comfortable in other languages are not good full stack developers.

But I don't believe that this is true to begin with; most places I've seen, or the places where my friends work, have people who focus on front-end and people who focus on back-end. There's just too much of a crazy front-end landscape to expect magic from someone who doesn't live in that world, and back-end is its own world.


> Yes, you chopped off the important half of the sentence

It doesn't matter if CRUD apps are easy to write, the point is there's market demand. And for some reason you write them off as trivial, but I've seen some business logic/requirements that'll make your head spin.

> The people I've met that are good at both front-end and back-end development are comfortable in multiple languages to begin with.

This type of position is becoming way more common. But I imagine many of them would prefer to work with one language anyways (I do). It keeps things simple. One language to be an expert on, that can be interviewed for, one testing library, one debugger, one set of coding standards. It's kind of a no brainer. I'd argue that almost any expert front-end dev can be a decent back-end developer.


I've worked on projects in which the front end and back end were written in the same language, and ones where they were different. The former were more productive IMHO, with developers working on both and with shared libraries & shared test platforms. The latter often end up siloed, with stuff passed over the fence. Yes, many developers can do both, but it's difficult to do both at the same time, and even harder to hire for.


Here's my experience as a solo developer for all of my projects. When I have to do everything from marketing, SEO, domain/dns, server setup, to database design, interface design, and whatever else. Doing most of the actual programming in one language, rather than many, helps considerably in my workflow. The NPM ecosystem is also awesome.


Bingo. Having one language for your front end, back end, build tools, database schema... It simplifies life a lot.

You can reuse code / modules.

You don't suffer from the context switching inherent in jumping between languages.

Also, people act like JavaScript is a bad language purely because it is not statically typed. It's an incredibly fast and robust language with a vibrant ecosystem -- something I'd argue is more important than pure technical bullet points.
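The code-reuse point above can be sketched concretely: one validation function shared by the browser bundle and the Node back end. The module shape and the deliberately naive regex here are illustrative, not a production-grade validator:

```javascript
// validate.js -- a hypothetical module shared by front end and back end.
// The simple regex is for illustration only, not RFC-compliant validation.
function isValidDomain(s) {
  return /^[a-z0-9-]+(\.[a-z0-9-]+)+$/i.test(s);
}

// In Node this file is require()d; in the browser the same file ships
// via the bundler, so the validation logic exists exactly once.
if (typeof module !== 'undefined') {
  module.exports = { isValidDomain };
}

console.log(isValidDomain('example.com'), isValidDomain('not a domain')); // true false
```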


> It's an incredibly fast and robust language with a vibrant ecosystem

You will find support for the most random and amazing things in JS. It is truly mindblowing.

I recently had a client request to run a process involving extracting data from an Excel file. I originally wrote it in Python using openpyxl (http://openpyxl.readthedocs.io/en/default/), then it turned out the client also needed XLSB support. openpyxl and xlrd don't support XLSB; the only thing that did work was a JS library (http://oss.sheetjs.com/js-xlsx/) with a node module.


Yes, it's pretty much bananas. There is SO MUCH work being put into the ecosystem, that there really isn't much you can't do at this point. There is a node module for EVERYTHING. For all of JS's warts, it's fantastic to deal with.


For those that want to get more strict typing there are two great options in Typescript and Flow.


I like the dynamic + functional aspects of JS. I just wish that Node JS semantics were more like Erlang (which I have only admired from afar, alas - Java brain death forever!).

E.g. - concurrent operations in a single VM, but using actors exchanging immutable messages, NOT shared data and locks.


> Having one language for your front end, back end, build tools, database schema... It simplifies life a lot

If only it were a better one.

Maybe something that was designed from the beginning to be a general-purpose programming language, and not a hacked-together-in-a-weekend scripting language intended to add low-grade interactivity to HTML, that has bloated and twisted into the mess it has become.

I'm eagerly awaiting the advent of the WebAssembly era, and honest-to-goodness real compilers. Flying Spaghetti Monster be praised, then we may be able to leave some of this nonsense behind.


Node.js lets you require() packages from npm. There, all the code you'll ever need.

Instead of writing "a quick Python script" for personal use, I now write "a quick node script" because there's almost no thinking required on my part. Just require the packages and throw them together.

Not even Python has this package-completeness.


Integrating a library requires a lot of thinking, or you'll integrate libraries that are immature and break, or don't exactly fit what you wanted and have to be hacked around. I've had to work in projects that used your methodology long-term and I would rather turn down a job than suffer through that again.


What do you mean by "package-completeness"?

While npm has a lot of packages, Python has a lot of fully matured, well-tested and well-documented packages.

It's apparent to me that there are a lot of low-quality, duplicate, and otherwise unstable npm packages.

Don't get me wrong, I actually like npm too but I don't think the Python comparison is even in the same ballpark.


The term for this is "Magpie-driven development".


Depends who you ask really.

If you are a developer then you should invest your time and be able to learn and use the best tool for the job.

If however you are something like a Product manager then learning JS is much more valuable time investment. It is the one language that will, currently, run anywhere ... Backend, Frontend, Side end, whatever.

You can quickly get good enough at JS to prototype the full app stack and come up with something that works, go to market, and fail. Should you find yourself in the unlikely position that your product meets with success, you can go ahead and hire a Go/Elixir/WebASM whatever guy to optimize it for larger throughput.

You have 24 hours in a day: 6 to sleep, 3 to breathe, 1 to eat. How do you want to spend the remaining 14? You get good at what you practice :)


IMHO, the core piece is that supporting a language is very, very expensive: tooling, CI, linting--especially standards. Many of the things that ensure quality are not language-agnostic. It's not about learning a language, it's about maintaining one.

If you are a web shop you have javascript. Using node.js allows you to eliminate one extra language.

Is it perfect? Definitely not. Is it best for everything? Probably not the best for anything. But, it's good enough for most stuff. It doesn't make sense to incur the burden of another language for a one off task.


>"more people know JavaScript so you don't have to learn a new language, which saves you time".

This is also an excuse for NIH (never invent here), where people try to shoehorn in an OSS product even if it doesn't fit their needs.


I know Javascript well enough to avoid it when possible.


ECMAScript 6 is a wonderful language that approaches the syntactic readability of Python. If you're stuck with ECMAScript 5, sure, go ahead and say JavaScript is terrible, because it is. But ECMAScript 6? No way. They fixed so many annoying things: the fat arrow that captures "this", proper classes and inheritance, maps and sets, and string template literals are the biggest ones that jump to mind. ES6 is to JavaScript what C++11 is to C++ -- it made it so you don't have to use the warts. ECMAScript 6 snippet:

  class Shape {
      constructor (id, x, y) {
          this.id = id
          this.move(x, y)
      }

      move (x, y) {
          this.x = x
          this.y = y
      }

      fixed_offset(off_x,off_y) {
        return () => {
          console.log(`offsets x: ${off_x} y: ${off_y}`)
          return [this.x + off_x, this.y + off_y]
        };
      }
  }
And yes, ECMAScript 5 is gross:

  var Shape = function (id, x, y) {
      this.id = id;
      this.move(x, y);
  };
  Shape.prototype.move = function (x, y) {
      this.x = x;
      this.y = y;
  };
  Shape.prototype.fixed_offset = function (off_x, off_y) {
    var self = this;
    return function () {
      console.log("offsets x: " + off_x + " y: " + off_y);
      return [self.x + off_x, self.y + off_y];
    };
  };


I do not care about the rearrangement of deck chairs on the Titanic that is ES6. I wouldn't care if JS were uglier than Erlang, if it worked as well.

The only people who care about ES6 classes are the ones who couldn't be bothered to learn how to use duck typed prototypes and were always trying to shoehorn in polymorphism. Prototypes were one of the few things Javascript had going for it, but adding classes means now there are two OO systems that don't interoperate. This is a new problem, not an improvement. There's a similar problem with callbacks and promises.

Other problems added in ES6: imports don't give an error when the thing I'm importing doesn't exist, continuing the JS tradition of handling errors by pretending everything is okay.

And this is in addition to the rotten core of JavaScript. There remains no integer type. Basic operators still take arguments of all types and return something regardless of whether it makes any sense. "this" still has little to do with this.

In any other language I'd complain about the lack of threading, but given JS can't handle easy stuff like implementing a sane comparison operator, it's probably better that they don't try anything legitimately difficult like threading.


Semicolons are not optional in JavaScript: ASI (http://www.ecma-international.org/ecma-262/6.0/#sec-automati...) is an error correction scheme for novice programmers. The spec's parsing rules call out the statements following where a semicolon should be as "offending tokens". There is no leeway here for style or preference.
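The classic place where relying on ASI bites even experienced programmers is a newline after `return`; a minimal demonstration:

```javascript
// ASI trap: a newline after `return` gets a semicolon inserted, so the
// object literal below is parsed as an unreachable block, not a return value.
function broken() {
  return
  { ok: true };
}

function fixed() {
  return { ok: true };
}

console.log(broken(), fixed()); // undefined { ok: true }
```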


and yet it also gives us this mess:

  import defaultMember from "module-name";
  import * as name from "module-name";
  import { member } from "module-name";
  import { member as alias } from "module-name";
  import { member1 , member2 } from "module-name";
  import { member1 , member2 as alias2 , [...] } from "module-name";
  import defaultMember, { member [ , [...] ] } from "module-name";
  import defaultMember, * as name from "module-name";
  import "module-name";

don't get me wrong, i think ES6 is much better than what we had before, but it still has a lot of warts


> As node.js is not multi-threaded, we spin up 4 instances of node.js per server, 1 instance per CPU core. Thus, we cache in-memory 4 times per server.

And why not use a shared memory server?

> Operations started adding rules with 100,000s of domains, which caused a single set of rules to be about 10mb large ... If we weren’t using node.js, we could cut this bandwidth by 4 as there would only be one connection to the Redis cluster retrieving rule sets.

Maybe a 10mb json string isn't the best design decision.....

Or, you know, you could have one node process connect to the Redis server and have the local processes read from a shared memory server. Or you could not store your rules as a 10mb friggin JSON string.
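On the representation point: assuming (as the article hints) the rules are mostly big domain lists, a flat newline-delimited string turns into a lookup Set with one split and no intermediate objects, while JSON-of-objects pays for structure it doesn't need. An illustrative sketch:

```javascript
// Illustrative comparison of rule-set representations; the domain list
// and rule shape are made up, not the article's actual data.
const domains = Array.from({ length: 100000 }, (_, i) => `example${i}.com`);

// JSON-of-objects representation: parse, then build the lookup structure.
const asJson = JSON.stringify(domains.map((d) => ({ domain: d, action: 'block' })));
const fromJson = new Set(JSON.parse(asJson).map((r) => r.domain));

// Flat text representation: one split, far less to parse and allocate.
const asText = domains.join('\n');
const fromText = new Set(asText.split('\n'));

console.log(fromJson.size === fromText.size, fromText.has('example42.com')); // true true
```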

> When rule sets were 10mb JSON strings, each node.js process would need to JSON.parse() the string every 30 seconds. We found that this actually blocked the event loop quite drastically

Well then do it in another thread and save it to shared memory. Maybe, just maybe, JSON strings aren't the tool for the job here.
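The blocking the article describes is easy to reproduce, since `JSON.parse` runs synchronously on the event-loop thread. A sketch with an illustrative payload of similar shape (timings will vary by machine):

```javascript
// Reproducing the event-loop blocking: the payload below is illustrative,
// shaped like a large list of domain rules, not the article's actual data.
const rules = Array.from({ length: 100000 }, (_, i) => ({
  domain: `example${i}.com`,
  action: 'block',
}));
const payload = JSON.stringify(rules); // several MB of JSON

const start = process.hrtime.bigint();
const parsed = JSON.parse(payload); // nothing else on the event loop runs during this
const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;

console.log(`parsed ${(payload.length / 1048576).toFixed(1)} MB in ${elapsedMs.toFixed(1)} ms`);
```

Every repeat of that call (here, every 30 seconds per process) stalls all pending callbacks for the full parse duration, which is why moving the work off-thread or shrinking the payload helps.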


While I don't know node.js very well and can't speak to the "shared memory server", I do think it is valid to point out that this sort of redesign to fit the platform is frustrating. I've hit similar situations in Python where an async I/O framework is used and at some point there's a CPU-bound task. The application was redesigned and new microservices were started in order to cope, and the architecture became convoluted. This sort of question always seems to show up in Python, where limitations of the GIL influence the design of the software, the libraries used, and deployment. It seems node.js, not surprisingly, has the same problems.

As the above comment clearly points out, there are workarounds to make things work. Still, when I think about the amount of time and effort that goes into debugging systems when they hit limits like the author mentioned, I imagine most of the benefits gained by using a language like Python or JavaScript go away.


It's not a Node.js limitation either. The design was not thought out from the beginning.

It has happened to me before. I design something, then I run it on multiple CPUs and it doesn't make sense anymore...

Or I realize that if I try to scale, the current design doesn't hold.

Node.js running in a single thread is a good thing if you know how to work with it. Or if you really need multiple threads, use another PL.

Or you could write it in C++ and bind it to Node.js. And run it as a background process. Or spin up a worker? There seem to be multiple solutions here...


NodeJS is a multi-threaded process. You can verify this with the `top` or `ps` commands. Async methods to decode JSON: https://github.com/nodejs/node-v0.x-archive/issues/7543


The thread thing in nodejs seems very misunderstood. Only the JavaScript runs in one thread. I think it's libuv that uses 4 threads to do most of the work in node.


Is that issue actually resolved?


While I agree Node.js isn't the right tool for every job -- just like anything else, really -- after reading his description of the problem, I can't shake off the feeling that the main issues he has with performance in this case have very little to do with Node itself. Parsing a huge JSON string in any language would block a CPU for a while. This JSON then becomes a huge hash table in memory, so no wonder each process uses up a lot of RAM. I don't know how these rules are then used, but it seems to me he might be better off rethinking how to do shared memory in this case before he simply blames Node for blocking the CPU and wasting memory.

That said, I can imagine other languages (like Java or Go) could still end up being more efficient than Node.


The issue isn't that it takes a long time to parse the JSON. It's that the server can't do anything else while it's parsing. In Java, for example, you could parse the JSON on a background thread without affecting your ability to serve requests.

Similarly, the memory issue isn't so much that a single copy of the table takes a lot of space, but rather that they need to store 4 copies of the table -- because they're running 4 different processes in order to utilize multiple cores.

Both of these issues are specific to nodejs.


"Operations started adding rules with 100,000s of domains, which caused a single set of rules to be about 10mb large"

There's not enough detail to be sure, but this sounds more like "when a relational database would be a better idea than redis."

Edit: That is, pushing the evaluation of the rules down to the database... rather than pulling a key/value and walking 10MB (of JSON?) to get to the small number of rules that apply for the transaction.


This is an excellent article which really highlights the underlying trade-offs when you choose node for your service (I/O-bound work vs. CPU-bound).

Unless you know for sure what limits you will hit, it makes sense to iterate quickly and find out. Then, if the service is actually hitting limits (and probably not the ones you thought), re-write it in a multi-threaded concurrent language like Go, Elixir, etc. -- or a language designed to solve the actual problems the service is hitting (which might be disk I/O or other infrastructure-level things, not language choice).


They could have fixed part of the architecture by having a "cache service" process (4 CPUs: 3 for proxies, 1 for the cache service). With that they'd have a single point consuming their limited resources (memory, CPU, and sockets for Redis connections), using IPC to communicate between processes.


I thought the same, or even keeping the same 4 CPUs for proxies and a shared one for the cache service.


JSON.parse() is one issue we faced regularly. Any large amount of data fetching could block the event loop and the whole server slows down. It's very unforgiving.

We go to great lengths to figure out which attributes to fetch and add limits to all our SQL queries. These are best practices anywhere, but with node they are a must.
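The blocking is easy to reproduce. The payload shape below is made up, but the effect is the same for any multi-megabyte string: a timer due in 10 ms cannot fire until the synchronous parse finishes, because there is only one JS thread.

```javascript
// Build a multi-megabyte JSON string of hypothetical domain rules.
const big = JSON.stringify({
  rules: Array.from({ length: 100000 }, (_, i) => ({
    domain: 'example' + i + '.com',
    action: 'allow',
  })),
});
console.log('payload ~', (big.length / 1e6).toFixed(1), 'MB');

let timerDelay;
const t0 = Date.now();
setTimeout(() => {
  timerDelay = Date.now() - t0;
  console.log('10 ms timer actually fired after', timerDelay, 'ms');
}, 10);

// Runs synchronously on the only JS thread; the timer above must
// wait for it, no matter how late it becomes.
const parsed = JSON.parse(big);
console.log('parsed', parsed.rules.length, 'rules');
```

In a real proxy, every in-flight request is stalled for the same duration as that timer.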


There are always streaming JSON parsers[1] that will help you when parsing large JSON data sets.

[1]: https://github.com/dominictarr/JSONStream


Node.js isn't great for CPU bound tasks in general.


Parsing JSON is sneaky because it can show up in apps that are otherwise I/O-bound (where Node shines) and don't seem like they should be CPU-intensive on the surface.


> JavaScript is my first non-mathematical programming language and I haven’t found the need to expand my programming skills to more

-Having a hard time taking anything this guy says seriously


I debated between learning Node and Go for my latest project. I took a couple days doing beginner tutorials on each, and Go was actually a lot easier for me to learn. Could just be my background, but I know a couple other people who picked it up in about a week too, it's surprisingly simple.


> it's surprisingly simple.

Actually, that's its fundamental design. They reduced everything down to a very small core with very few features, so things would be obvious and you don't have to learn or remember much. It's refreshingly simple!

With that said, I'd say they reduced it too far down. There are no generics, so you end up using `interface{}` everywhere, which often leads to issues due to its late binding. Or you end up just using codegen tools, IIRC.

Also, since there are no exceptions, you end up with constant checks for error-code returns, which usually end up just being strings and not much else. I'm not saying exceptions are the best way to approach error handling, but they do allow you to unwind an entire function call stack and clean up any state along the way, while adding more granular error information you can write handlers to react to. Go reminds me of the pain that was C error handling and juggling error codes.


> On each server, rules are retrieved from Redis and cached in-memory using an LRU-cache. As node.js is not multi-threaded, we spin up 4 instances of node.js per server, 1 instance per CPU core. Thus, we cache in-memory 4 times per server. This is a waste of memory!

This is completely standard and the only way to do node in-memory caching. Think of each worker as a completely independent node process, which is only bound to the cluster by a master process that has the ability to spawn and kill child cluster processes.


Seeing as Redis is already in-memory and has an LRU-cache feature, and he's already caching data from a database in Redis, the whole Node LRU seems awfully redundant and unnecessary.


> This is completely standard and the only way to do node in-memory caching.

This isn't accurate; you can use shared memory. There are a few modules that implement this. In addition, you can offload the JSON.parse to a dedicated "caching" process that updates the shared memory.


Do you have a link that describes an example of this?

OK, never mind, Google is my friend: https://github.com/PaquitoSoft/memored

I can see where this would come in handy. But at the 240MB total resident memory per CPU across 4 node workers that OP describes, I wouldn't hassle with it.


re: multiple processes duplicating memory, would a single memcache instance or similar solve this problem? I don't have any perspective on how that would perform at scale vs. individual programs reading from application state. Although thinking about it, each process would probably have to store all that data in app memory anyway...


It was a very unfortunate decision for Node devs to deep-six multithreaded web workers. A pull request implementing it was ready to go, with an optional flag to enable it, but they did not want to support it. So node will be forever more compute bound to a single thread blocking all I/O.


> So node will be forever more compute bound to a single thread blocking all I/O.

I don't think I understood your point. But isn't Node.js supposed to not block on long-lasting tasks?


I'd also add that node.js might not be the right choice for complex backend business logic with lots of service calls, because Node.js' always-async execution model tends to make things more complicated than they need to be.


I think "always async" is the main advantage of node.

My general (perhaps wrong) impression is that other languages commonly used in backends are moving toward async I/O, usually through maturing libraries.


It's not that I don't like async (I think it's a defensible choice for JavaScript), but in my experience backend logic for e.g. e-commerce apps doesn't benefit from using it. Projects which expose services to web frontends are often implemented in an architecture where only one-shot, aggregated service calls (often REST-y services) are exposed to Node.js, with the actual business processing and granular service calls being implemented in e.g. Java web services in a synchronous programming style. I actually like that architecture because it gives front-end devs leeway to define their browser-facing head server backend, rather than enshrining a dogmatic frontend/backend architecture upfront.

It's true that other languages add async models (or emphasize those they already have), but e.g. in the case of Java you're sitting on 20 years of synchronous library and custom code, and it's not clear moving to async is worth it at this point.


Most services that might be considered "backends" (i.e. databases, queues, cloud services) end up being written in languages that can safely use async techniques for I/O and still use real threading or some other method for managing CPU-bound problems.

Many of the applications people think of for node.js end up being "glue", much like python, and can live within this constraint for a very long time, where the I/O optimization is a nice benefit.


async/await allows you to choose how async you want your code to be/look, and I for one like having a choice, as opposed to, say, other languages that don't always have an async option.


C# has this same async/await. The only difference: I can choose a sync workflow by appending `.Result` to an async function, i.e. `var response = asyncFunction().Result;`

Node.js is garbage (well, good for simple crud apps) and so is JS.


But with async/await in JS, you just prepend an `await` and you get the sync workflow.

That being said, I do like how C# handles it more than the way JS has handled it, but both are fantastic.

You can say JS is "garbage" all you want, but it's close to the most popular language on the planet and many many people are writing great things because of it.
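A sketch of the point above: prepending `await` gives async code a synchronous *shape* without actually blocking the event loop (the function and rule names here are made up):

```javascript
// Stand-in for a real Redis/HTTP call that resolves asynchronously.
function fetchRule(domain) {
  return new Promise((resolve) => setTimeout(() => resolve('allow'), 10));
}

let result; // captured at top level so the outcome is observable

async function main() {
  // Reads like sync code, but the event loop keeps running while
  // the promise is pending.
  result = await fetchRule('a.example.com');
  console.log('action:', result);
}

main();
```

Unlike C#'s `.Result`, there is no way to truly block the JS thread on a promise, which is arguably the point of the design.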


"With lots of service calls" describes problems for which async is generally the most performant solution.


It could be, but if the service calls generally depend on the response from the previous service call, or if you need to rate-limit your calls to these services, I think it's easier to code it without async.


For batches of transactions where each transaction has a service call that depends on the response from a previous service call, async especially outperforms.

Rate limiting is not hard either way, but single-threaded async is slightly easier because your counter is automatically thread-safe.

The larger and more complex your system gets the harder it gets to keep it thread-safe. For me this is the big advantage of single-threaded async, and I like node because this single-threaded async is idiomatic in JavaScript.
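A sketch of why the single-threaded model makes this easy: the concurrency limiter below needs no locks, because only one callback ever touches `inFlight` at a time (all names are illustrative):

```javascript
const MAX_CONCURRENT = 2;
let inFlight = 0;
let peak = 0; // highest concurrency observed, for the demo below
const queue = [];

function limited(task) {
  return new Promise((resolve) => {
    const run = () => {
      inFlight++; // safe without a mutex: single JS thread
      peak = Math.max(peak, inFlight);
      task().then((value) => {
        inFlight--;
        if (queue.length) queue.shift()();
        resolve(value);
      });
    };
    if (inFlight < MAX_CONCURRENT) run();
    else queue.push(run);
  });
}

// Demo: 5 tasks, but never more than 2 running at once.
const work = () => new Promise((r) => setTimeout(r, 20));
Promise.all([1, 2, 3, 4, 5].map(() => limited(work)))
  .then(() => console.log('done; peak concurrency was', peak));
```

The equivalent in a multithreaded runtime needs an atomic counter or a semaphore; here the event loop is the semaphore.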


I strongly disagree. Making many async service calls concurrently and awaiting the result is the textbook use case for node.js (well, event-based programming in general).


"Usually" /snark


Good detailed write-up.

There were probably opportunities for the author to architect the system in ways that were better suited to node (given that it was the chosen platform), but the architecture choices were not unreasonable by design. These are some good things to consider when architecting a system and considering node as the platform.

I'm not sure I agree that node is "perfect for simple CRUD apps", though.


I think it's a lot closer to 10x as fast for Rust and 6-7x for Go.


When WebAssembly comes, what will that mean for the node.js ecosystem?


Nothing; WebAssembly targets the browser. Devs who already understand JS could just as easily pick up Node.


Always.



