Seeing as how Node is single-threaded, if I have a Node server running on an Amazon EC2 instance with 4 EC2 Compute Units, will it run any faster / handle more load than if I have 2 EC2 Compute Units?
In Node.js, your code is single-threaded, but calls that e.g. access the file system or a database server do not use the main node.js thread. The main thread keeps executing while other threads are waiting for 4GB to be read from disk to RAM or for the DB server to return a response. Once the action finishes, the supplied callback is put in a queue to execute in the main thread. More or less, anyway.
The advantage being that in a server situation, you have one very fast thread that can handle thousands of concurrent requests without putting any one entirely on hold or spawning an OS thread for each client request-response cycle.
More to the point, you should benchmark your specific use case on EC2 -- multiple processors may be useful when running a single instance of node if the app does a lot of IO.
The short answer to your question is that adding more cores will not improve your node performance if all you do is write "standard" single-threaded JavaScript (you will be bound by a single CPU).
The reason is that node.js uses an event loop for processing, so if all you are doing is starting up a single node.js process without anything else, it will not be multi-threaded and thus not use more than one CPU (core).
However, you can use the node.js cluster API to fork the node process so you can take advantage of multiple CPUs (cores): https://nodejs.org/docs/latest/api/cluster.html. If you write your code that way, then having more compute units will help you.
There is one caveat: EC2 compute units are specified per instance type, and some instance types give you more compute units per virtual core than others. So if you pick an instance that has 2 compute units per virtual core rather than one that has 1 per core, a single node process will run on a faster core. However, it looks like beyond 2 compute units the extra computing power comes only as additional cores, which means a single-threaded process won't get any benefit from it.
To fully utilize compute resources of N cores, you need at least N threads ready to do useful work. This has nothing to do with EC2; it's just the way computers work. I assume from your question that you are choosing between the m1.medium and m1.large instance types, which have 1 and 2 dedicated cores, respectively (the m1.small is half of a shared core, and the m1.xlarge is the full dedicated 4-core box). Thus, you need at least 2 processes doing useful work in order to utilize the larger box (unless you just want access to more memory / io).
Each Node.js process is single-threaded by design. This lets it provide a clean programming paradigm free of locking semantics.
For a Node.js app to utilize multiple cores, it must spawn multiple processes. These processes would then use some form of messaging (pipes, sockets, etc) to communicate -- versus "shared memory" where code can directly mutate memory locations visible to multiple processes, something that would require locking semantics.
In practice, this is dirt simple to set up. Back in Node.js v0.6.x the "cluster" module was integrated into the standard distribution, making it easy to set up multiple node workers that can listen on a single port. Note that this "cluster" module is NOT the same as the learnboost "cluster" module, which has a different API and owns the "cluster" name in the npm registry.
http://nodejs.org/docs/latest/api/cluster.html
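var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) { // named cluster.isPrimary in Node 16+
  // Fork one worker per CPU core.
  for (var i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
} else {
  // Workers all listen on the same port; each handles requests independently.
  http.createServer(function(req, res) { /* ... */ }).listen(8000);
}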
Amazon's concept of total "EC2 Compute Units" for an instance type does not map directly to a CPU or core. It is the number of cores multiplied by the speed of each core in EC2 compute units (their own relative measurement).
Amazon does list how many virtual cores each instance type has:
http://docs.amazonwebservices.com/AWSEC2/latest/UserGuide/index.html?instance-types.html
Your best option is to use all of the cores as others point out. However, if you end up with a single threaded solution, then you will want to focus on the speed of the individual cores, not the total EC2 compute units of all the cores added together.
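As a rough sketch of that comparison: the instance shapes below are hypothetical placeholders, not Amazon's published figures, and the helper names are made up. The only thing taken from the text above is the relationship total compute units = cores × compute units per core.

```javascript
// Hypothetical instance shapes, for illustration only; look up real
// figures in Amazon's instance-type table.
const instances = [
  { name: 'example-a', cores: 4, totalEcu: 4 },
  { name: 'example-b', cores: 2, totalEcu: 4 }
];

// Per-core speed is what bounds a single-threaded process.
function ecuPerCore(instance) {
  return instance.totalEcu / instance.cores;
}

// Picks the instance whose individual cores are fastest.
function bestForSingleThread(list) {
  return list.reduce(function (best, candidate) {
    return ecuPerCore(candidate) > ecuPerCore(best) ? candidate : best;
  });
}

console.log(bestForSingleThread(instances).name); // prints: example-b
```

Both made-up instances have the same total, but a single node process would only see the per-core figure (1 versus 2 here).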
If I have node server running on an amazon EC2 instance with 4 EC2 Compute units will it run any faster / handle more load than if I have 2 EC2 Compute units?
No. If you are using node.js in a server capacity, a single node process will only have access to a single core.
var http = require('http');
http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.end('Hello World\n');
}).listen(1337, "127.0.0.1");
console.log('Server running at http://127.0.0.1:1337/');
This spawns a single listener, but that doesn't mean only a single connection. Node.js breaks conventional thought that way. The event loop will not block connections unless you code improperly, for example by doing long synchronous work inside a callback. This post helps to explain the event loop and how important it is to understand it. It took me a while to really 'get' the implications.
Does CPU utilization on amazon require a program to be multithreaded to fully use all resources?
Yes. Properly configured Apache/nginx will take advantage of multi-CPU configurations. node.js servers are being developed that will also take advantage of these kinds of configurations.