Node.js looks interesting, but I must be missing something: isn't Node.js tuned to run only on a single process and thread?
Then how does it scale for m
One method would be to run multiple instances of Node.js on the server and then put a load balancer (preferably a non-blocking one like nginx) in front of them.
As mentioned above, Cluster will scale and load-balance your app across all cores.
Adding something like
cluster.on('exit', function () {
  cluster.fork();
});
will restart any failing workers.
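A slightly fuller sketch of that restart hook, assuming the built-in cluster module. The helper name `respawnOnExit` is made up for illustration. It skips workers that were stopped on purpose, so a deliberate `worker.disconnect()` does not trigger a respawn.

```javascript
const cluster = require('cluster');

// Hypothetical helper: re-fork a worker whenever one dies unexpectedly.
// `clusterModule` is passed in so the logic can be exercised with a stub.
function respawnOnExit(clusterModule, log = console.error) {
  clusterModule.on('exit', (worker, code, signal) => {
    // exitedAfterDisconnect is set when the worker was stopped on purpose
    // via worker.disconnect() or worker.kill(); do not replace those.
    if (worker.exitedAfterDisconnect) return;
    log(`worker ${worker.process.pid} exited (${signal || code}), forking a replacement`);
    clusterModule.fork();
  });
}
```

In the master process you would call `respawnOnExit(cluster)` once, before forking the initial workers. Note that a worker which crashes immediately on startup will be re-forked in a tight loop, so some kind of back-off is probably worth adding in a real deployment.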
These days, a lot of people also prefer PM2, which handles the clustering for you and also provides some cool monitoring features.
Then, add Nginx or HAProxy in front of several machines running with clustering and you have multiple levels of failover and a much higher load capacity.
Node.js absolutely does scale on multi-core machines.
Yes, Node.js is one-thread-per-process. This is a very deliberate design decision and eliminates the need to deal with locking semantics. If you don't agree with this, you probably don't yet realize just how insanely hard it is to debug multi-threaded code. For a deeper explanation of the Node.js process model and why it works this way (and why it will NEVER support multiple threads), read my other post.
Two ways:
Since v0.6.x, Node.js has included the cluster module straight out of the box, which makes it easy to set up multiple node workers that can listen on a single port. Note that this is NOT the same as the older learnboost "cluster" module available through npm.
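var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;
// Node 16+ calls this cluster.isPrimary; isMaster remains as a deprecated alias.
if (cluster.isMaster) {
  // Fork one worker per CPU core.
  for (var i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
} else {
  // Every worker listens on the same port; the cluster module shares it between them.
  http.Server(function(req, res) { ... }).listen(8000);
}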
Workers will compete to accept new connections, and the least loaded process is most likely to win. This works well in practice and can scale throughput considerably on a multi-core box.
If you have enough load to care about multiple cores, then you are going to want to do a few more things too:
Run your Node.js service behind a web-proxy like Nginx or Apache - something that can do connection throttling (unless you want overload conditions to bring the box down completely), rewrite URLs, serve static content, and proxy other sub-services.
Periodically recycle your worker processes. For a long-running process, even a small memory leak will eventually add up.
Set up log collection and monitoring.
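The worker recycling mentioned above could be sketched roughly like this, assuming the built-in cluster module. `recycleWorkers` is a hypothetical helper, not part of any library. It replaces workers one at a time and only retires an old worker once its replacement is listening, so capacity should not drop during the roll.

```javascript
const cluster = require('cluster');

// Hypothetical helper: replace every current worker, one at a time.
// `clusterModule` is passed in so the logic can be exercised with a stub.
function recycleWorkers(clusterModule, done = () => {}) {
  // Snapshot the workers that exist right now; replacements are not recycled.
  const oldWorkers = Object.values(clusterModule.workers);

  function next() {
    const old = oldWorkers.shift();
    if (!old) return done();
    const replacement = clusterModule.fork();
    // Wait until the replacement is accepting connections before retiring the old one.
    replacement.once('listening', () => {
      // disconnect() stops the old worker accepting new connections;
      // it exits once its existing connections have closed.
      old.disconnect();
      next();
    });
  }

  next();
}
```

In the master you might call this from a timer, for example `setInterval(function () { recycleWorkers(cluster); }, 6 * 60 * 60 * 1000)`; the six-hour interval is an arbitrary illustration. If you also have an 'exit' handler that re-forks dead workers, make sure it ignores workers whose `exitedAfterDisconnect` flag is set, otherwise every recycled worker gets replaced twice.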
PS: There's a discussion between Aaron and Christopher in the comments of another post (as of this writing, it's the top post). A few comments on that:
Shared Ports: nginx (port 80) --> Node_workers x N (sharing port 3000 w/ Cluster)
vs
Individual Ports: nginx (port 80) --> {Node_worker (port 3000), Node_worker (port 3001), Node_worker (port 3002), Node_worker (port 3003) ...}
There are arguably some benefits to the individual ports setup (potential to have less coupling between processes, have more sophisticated load-balancing decisions, etc.), but it is definitely more work to set up and the built-in cluster module is a low-complexity alternative that works for most people.
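For comparison, the individual ports setup could be launched with something like the sketch below. It uses `child_process.fork` rather than cluster, so each worker is an independent process with its own listening socket. `startOnIndividualPorts` and the `PORT` environment variable are illustrative choices, not an established convention of any library.

```javascript
const { fork } = require('child_process');

// Hypothetical helper: start `count` copies of `script`, each told to listen
// on its own port (basePort, basePort + 1, ...) via the PORT env variable.
// `forkFn` is injectable so the logic can be exercised without spawning anything.
function startOnIndividualPorts(script, basePort, count, forkFn = fork) {
  const ports = [];
  for (let i = 0; i < count; i++) {
    const port = basePort + i;
    forkFn(script, [], { env: { ...process.env, PORT: String(port) } });
    ports.push(port);
  }
  return ports;
}

// The worker script is assumed to do something like:
//   http.createServer(handler).listen(Number(process.env.PORT));
```

Calling `startOnIndividualPorts('./server.js', 3000, 4)` (with `./server.js` as a placeholder path) would give you workers on ports 3000 to 3003, which you would then list as separate backends in the nginx configuration. That extra bookkeeping is the "more work" part of the trade-off.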