Question
getresponse issues many recv calls while reading the header of an HTTP response. It actually issues one recv call per byte, which results in many system calls. How can this be optimized?
I verified this on an Ubuntu machine with an strace dump.
Sample code:
import httplib  # Python 2 module; renamed http.client in Python 3

conn = httplib.HTTPConnection("www.python.org")
conn.request("HEAD", "/index.html")
r1 = conn.getresponse()
strace dump:
sendto(3, "HEAD /index.html HTTP/1.1\r\nHost:"..., 78, 0, NULL, 0) = 78
recvfrom(3, "H", 1, 0, NULL, NULL) = 1
recvfrom(3, "T", 1, 0, NULL, NULL) = 1
recvfrom(3, "T", 1, 0, NULL, NULL) = 1
recvfrom(3, "P", 1, 0, NULL, NULL) = 1
recvfrom(3, "/", 1, 0, NULL, NULL) = 1
...
Answer 1:
r = conn.getresponse(buffering=True)
On Python 3.1+ there is no buffering parameter (buffering is the default).
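To see why buffering makes the difference, here is a minimal Python 3 sketch that reproduces the effect locally, without httplib or a network connection. It assumes the behavior comes from reading the header line by line from an unbuffered socket file object (sock.makefile("rb", 0)) versus a buffered one (sock.makefile("rb")). The CountingSocket class, the count_recv_calls helper, and the RESPONSE bytes are made up for this illustration and are not part of httplib.

```python
import socket

# Hypothetical response bytes, used only for this illustration.
RESPONSE = b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n"


class CountingSocket(socket.socket):
    """Socket subclass that counts receive calls made through makefile()."""

    recv_calls = 0

    def recv_into(self, *args, **kwargs):
        # File objects returned by makefile() read via recv_into().
        self.recv_calls += 1
        return super().recv_into(*args, **kwargs)


def count_recv_calls(buffered):
    """Read one header line and return (line, number of receive calls)."""
    writer, plain = socket.socketpair()
    # Re-wrap the reading end so that its receive calls are counted.
    reader = CountingSocket(plain.family, plain.type, plain.proto,
                            fileno=plain.detach())
    try:
        writer.sendall(RESPONSE)
        if buffered:
            fp = reader.makefile("rb")               # buffered reader
        else:
            fp = reader.makefile("rb", buffering=0)  # raw, unbuffered reader
        line = fp.readline()
        fp.close()
        return line, reader.recv_calls
    finally:
        writer.close()
        reader.close()


if __name__ == "__main__":
    for buffered in (False, True):
        line, calls = count_recv_calls(buffered)
        print("buffered=%s: %d receive call(s) to read %r"
              % (buffered, calls, line))
```

In the unbuffered case readline has to ask the socket for one byte at a time, which matches the pattern in the strace dump above. The buffered reader fetches a larger chunk per system call and serves readline from memory.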
Source: https://stackoverflow.com/questions/14519829/python-httplib-getresponse-issues-many-recv-calls