Question
I am trying to write a crawler that automatically downloads some files using the Python requests module, but I have run into a problem.
I create a new requests session and use the post method to log in to the website. After that, any further post or get call on the session fails (simplified code below):
import requests
s = requests.session()
s.post(url, data=post_data, headers=headers)
# Up to here everything works; any further call on the session raises the error:
# s.get(url), s.post(url), or even repeating the login post above.
s.get(url)
It reports an error like the one below:
Traceback (most recent call last):
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/requests/packages/urllib3/connectionpool.py", line 372, in _make_request
httplib_response = conn.getresponse(buffering=True)
TypeError: getresponse() got an unexpected keyword argument 'buffering'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/requests/packages/urllib3/connectionpool.py", line 544, in urlopen
body=body, headers=headers)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/requests/packages/urllib3/connectionpool.py", line 374, in _make_request
httplib_response = conn.getresponse()
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 1162, in getresponse
raise ResponseNotReady(self.__state)
http.client.ResponseNotReady: Request-sent
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/requests/adapters.py", line 370, in send
timeout=timeout
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/requests/packages/urllib3/connectionpool.py", line 597, in urlopen
_stacktrace=sys.exc_info()[2])
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/requests/packages/urllib3/util/retry.py", line 245, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/requests/packages/urllib3/packages/six.py", line 309, in reraise
raise value.with_traceback(tb)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/requests/packages/urllib3/connectionpool.py", line 544, in urlopen
body=body, headers=headers)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/requests/packages/urllib3/connectionpool.py", line 374, in _make_request
httplib_response = conn.getresponse()
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 1162, in getresponse
raise ResponseNotReady(self.__state)
requests.packages.urllib3.exceptions.ProtocolError: ('Connection aborted.', ResponseNotReady('Request-sent',))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "test.py", line 280, in <module>
test()
File "test.py", line 273, in test
emuch1.getEbook()
File "test.py", line 146, in getEbook
self.downloadEbook(ebook)
File "test.py", line 179, in downloadEbook
file_url=self.downloadEbookGetFileUrl(ebook).decode('gbk')
File "test.py", line 211, in downloadEbookGetFileUrl
download_url=self.downloadEbookGetUrl(ebook)
File "test.py", line 200, in downloadEbookGetUrl
respond_ebook=self.session.get(ebook_url)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/requests/sessions.py", line 477, in get
return self.request('GET', url, **kwargs)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/requests/sessions.py", line 465, in request
resp = self.send(prep, **send_kwargs)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/requests/sessions.py", line 573, in send
r = adapter.send(request, **kwargs)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/requests/adapters.py", line 415, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', ResponseNotReady('Request-sent',))
I have no idea why this happens. Can anyone help?
Answer 1:
Requests bundles its own internal copy of urllib3. My impression is that there is somehow a version mismatch between that internal urllib3 and requests itself.
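httplib_response = conn.getresponse(buffering=True)
TypeError: getresponse() got an unexpected keyword argument 'buffering'
This seems to indicate that requests is calling its internal urllib3 (not anything from Python's standard library), which then passes a 'buffering' argument that getresponse() in Python 3.4's http.client does not accept. The traceback shows urllib3 retrying without the argument (connectionpool.py, line 374), and that second call is the one that raises ResponseNotReady.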
The other problems are similar to what I experienced.
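The first two chained exceptions in the traceback can be reproduced with the standard library alone. The following is a minimal sketch, assuming the try-then-fall-back pattern that the traceback suggests at connectionpool.py lines 372 and 374. `read_response` is a hypothetical helper written for illustration, not urllib3's actual function, and the connection here is idle (state 'Idle') rather than in the 'Request-sent' state from the question.

```python
import http.client


def read_response(conn):
    """Hypothetical helper mirroring the try/fallback visible in the traceback."""
    try:
        # Python 2's httplib accepted buffering=True; Python 3's http.client does not.
        return conn.getresponse(buffering=True)
    except TypeError:
        # Fallback for Python 3: call again without the argument.
        return conn.getresponse()


# No request has been sent on this connection (and no socket is opened here),
# so the fallback call has no response to read and raises ResponseNotReady.
conn = http.client.HTTPConnection("localhost")
try:
    read_response(conn)
except http.client.ResponseNotReady as exc:
    first_error = exc.__context__  # the TypeError raised by the first attempt
    print(type(first_error).__name__, "->", type(exc).__name__)
```

On Python 3 the TypeError is raised first and the fallback call then raises ResponseNotReady, which is the same "During handling of the above exception, another exception occurred" chaining seen in the question.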
There are some other issues with the latest requests releases (2.6.*) that are being resolved now, and I suspect you are using one of them. Try falling back to an earlier version such as 2.4.1, or even 2.2.1. You can leave the latest version installed if you specify, at the top of your program, which version you want to use:
__requires__ = ["requests==2.2.1"]
import pkg_resources
(These lines have to appear before requests itself is imported.)
SOLUTION: Last week I exchanged several emails with the development team, and it looks like they have produced 2.7 with the fix very quickly. (In fact, I see it was uploaded only yesterday.) So if you experience similar problems, install the latest version.
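To see whether an installation already has the release mentioned above, the installed version can be compared against 2.7.0. This is a rough sketch: `parse_version` and `has_fix` are hypothetical helper names, the parsing only handles plain dotted numbers, and the 2.7.0 threshold is taken from this answer rather than from the requests changelog.

```python
def parse_version(text):
    """Convert a dotted version such as '2.6.2' to a tuple like (2, 6, 2)."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break                # stop at suffixes such as 'dev0'
        parts.append(int(piece))
    while len(parts) < 3:
        parts.append(0)          # pad so that '2.7' compares like '2.7.0'
    return tuple(parts)


def has_fix(version_text, minimum=(2, 7, 0)):
    """True when the version is at least the 2.7 release named in the answer."""
    return parse_version(version_text) >= minimum


try:
    import requests
except ImportError:
    print("requests is not installed")
else:
    status = "2.7 or newer" if has_fix(requests.__version__) else "older than 2.7"
    print(requests.__version__, status)
```

Comparing tuples of integers rather than raw strings avoids ordering '2.10.0' before '2.7.0'.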
Answer 2:
The problem was solved by upgrading requests to the latest version. It may have been a bug in 2.6 (I think that was the version I had, but I am not quite sure).
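For reference, the login-then-fetch flow from the question can be sketched as a small function that reuses one session for both calls. This is only an illustration: `login_then_fetch` and its parameters are hypothetical names, and the URLs and form fields in the usage comment are placeholders, not the site from the question.

```python
def login_then_fetch(session, login_url, page_url, post_data, headers=None):
    """Log in once, then reuse the same session (and its cookies) for a GET."""
    login = session.post(login_url, data=post_data, headers=headers)
    login.raise_for_status()   # stop early if the login request failed
    page = session.get(page_url, headers=headers)
    page.raise_for_status()    # raise on 4xx/5xx responses
    return page.content        # response body as bytes


# Usage with the real library (placeholder URLs and form fields):
#   import requests
#   with requests.Session() as s:
#       data = login_then_fetch(s, "https://example.com/login",
#                               "https://example.com/file",
#                               {"username": "...", "password": "..."})
```

Passing the session in as an argument keeps the function independent of how the session was created, so the same code works before and after an upgrade of requests.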
Source: https://stackoverflow.com/questions/30033516/single-session-multiple-post-get-in-python-requests