Here is what I'm trying to do: I'm developing a Node.js HTTP server that will hold long-lived connections for push purposes (in collaboration with Redis) from tens of thousands of mob
Just a few notes:
Do you need to wrap res in an object ({res: res})? Can you just assign it directly?
pendingClients[req.clientId] = res;
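A minimal sketch of what direct assignment could look like, assuming the registry is a plain lookup table keyed by client ID. The names `pending`, `park`, and `push` are hypothetical and stand in for whatever the question's code uses.

```javascript
// Hypothetical registry of parked responses, keyed by client ID.
// Object.create(null) avoids inherited keys such as "constructor".
const pending = Object.create(null);

// Park a response directly, without an intermediate wrapper object.
function park(clientId, res) {
  pending[clientId] = res;
}

// Finish the parked response for a client, if there is one.
// Returns true when a response was found and ended, false otherwise.
function push(clientId, payload) {
  const res = pending[clientId];
  if (!res) return false;
  delete pending[clientId];
  res.end(payload);
  return true;
}
```

Dropping the wrapper saves one small object per held connection, which only matters because the count of held connections is so large.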
EDIT: Another micro-optimization that might help:
server.emit('request', req, res);
passes two arguments to the 'request' handlers, but your request handler really only needs the response, 'res'. You could attach the client ID to the response and emit only that:
res['clientId'] = 'whatever';
server.emit('request', res);
While the amount of actual data stays the same, having one less argument in the 'request' handler's argument list saves you a reference (a few bytes). A few bytes can add up when you are handling hundreds of thousands of connections. You'll also save the minor CPU overhead of passing the extra argument on each emit call.
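As a rough sketch of the single-argument emit, here is the same idea on a bare EventEmitter. This assumes you are emitting 'request' yourself, as in the snippet above; a stock http.Server emits 'request' with both req and res on its own, so the stand-in objects here are purely illustrative.

```javascript
const { EventEmitter } = require('events');

// Stand-in for the server; a real http.Server is also an EventEmitter.
const server = new EventEmitter();
const seen = [];

// The handler takes only the response; the ID travels on the object itself.
server.on('request', (res) => {
  seen.push(res.clientId);
});

// Stand-in for a real response object, tagged with the client ID.
const res = {};
res.clientId = 'whatever';
server.emit('request', res);
```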
I think you shouldn't worry about further decreasing memory usage. From that readout you included, it seems you're pretty close to the bare minimum conceivable (I interpret it as being in bytes, which is standard when a unit isn't specified).
This is a more in-depth question than I can answer, but RSS is the resident set size: the portion of the process's memory that is currently held in RAM. The heap is where dynamically allocated memory comes from on Unix systems, as best I understand. So the heap total would be everything that has been allocated on the heap for your process, whereas the heap used is how much of that allocation is actually in use.
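If the readout came from process.memoryUsage() (an assumption on my part, based on the field names), a small sketch like this prints the same three figures in megabytes, which makes them easier to compare between runs. The `toMB` helper is hypothetical.

```javascript
// process.memoryUsage() reports sizes in bytes.
const usage = process.memoryUsage();

// Convert bytes to megabytes, rounded to one decimal place.
const toMB = (bytes) => Math.round((bytes / 1024 / 1024) * 10) / 10;

console.log('rss       (MB):', toMB(usage.rss));
console.log('heapTotal (MB):', toMB(usage.heapTotal));
console.log('heapUsed  (MB):', toMB(usage.heapUsed));
```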
Your memory usage is quite good, and it doesn't seem you actually have a leak. I wouldn't worry yet. =]
Don't know.
This snapshot seems reasonable. I expect some of the objects created from the surge of requests had been garbage collected, and others hadn't. You see there's nothing over 10k objects, and most of these objects are quite small. I call that good.
More importantly, though, I wonder how you're load testing this. I've tried to do massive load testing like this before, and most tools simply can't generate that kind of load on Linux because of the limit on the number of open file descriptors (generally around a thousand per process by default). As well, once a socket is used, it is not immediately available for reuse (the TCP TIME_WAIT state). It takes some significant fraction of a minute, as I recall, to become usable again. Between this and the fact that I've normally seen the system-wide open file descriptor limit set somewhere under 100k, I'm not sure it's possible to receive that much load on an unmodified box, or to generate it from a single box. Since you don't mention taking any such steps, I think you might also need to investigate your load testing to make sure it's doing what you think.
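One way to check those limits from Node itself, as a sketch that assumes a Linux box with /proc mounted (on other systems both reads simply come back as null). The helper names `readOrNull` and `parseMaxOpenFiles` are hypothetical.

```javascript
const fs = require('fs');

// Read a file as text; returns null if it is missing or unreadable.
function readOrNull(path) {
  try {
    return fs.readFileSync(path, 'utf8');
  } catch (err) {
    return null;
  }
}

// Pull the soft and hard "Max open files" values out of the text of
// /proc/self/limits. Values stay strings because they may be "unlimited".
function parseMaxOpenFiles(limitsText) {
  const match = /^Max open files\s+(\S+)\s+(\S+)/m.exec(limitsText || '');
  return match ? { soft: match[1], hard: match[2] } : null;
}

const perProcess = parseMaxOpenFiles(readOrNull('/proc/self/limits'));
const systemWide = readOrNull('/proc/sys/fs/file-max');

console.log('per-process open files:', perProcess);
console.log('system-wide file-max:', systemWide && systemWide.trim());
```

Run it on both the server and the load-generating machine: if either reports a per-process soft limit near a thousand, the test cannot be holding tens of thousands of connections.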