I am opening a file which has 100,000 URL\'s. I need to send an HTTP request to each URL and print the status code. I am using Python 2.6, and so far looked at the many con
Create epoll
object,
open many client TCP sockets,
adjust their send buffers to be a bit more than request header,
send a request header — it should be immediate, just placing into a buffer,
register socket in epoll
object,
do .poll
on epoll
obect,
read first 3 bytes from each socket from .poll
,
write them to sys.stdout
followed by \n
(don't flush),
close the client socket.
Limit number of sockets opened simultaneously — handle errors when sockets are created. Create a new socket only if another is closed.
Adjust OS limits.
Try forking into a few (not many) processes: this may help to use CPU a bit more effectively.