urllib.urlopen works but urllib2.urlopen doesn't

此生再无相见时 提交于 2019-12-03 12:31:56
John Millikin

Sounds like you have proxy settings defined that urllib2 is picking up on. When it tries to proxy "127.0.0.01/", the proxy gives up and returns a 504 error.

From Obscure python urllib2 proxy gotcha:

proxy_support = urllib2.ProxyHandler({})
opener = urllib2.build_opener(proxy_support)
print opener.open("http://127.0.0.1").read()

# Optional - makes this opener default for urlopen etc.
urllib2.install_opener(opener)
print urllib2.urlopen("http://127.0.0.1").read()

Does calling urlib2.open first followed by urllib.open have the same results? Just wondering if the first call to open is causing the http server to get busy causing the timeout?

I don't know what's going on, but you may find this helpful in figuring it out:

>>> import urllib2
>>> urllib2.urlopen('http://mit.edu').read()[:10]
'<!DOCTYPE '
>>> urllib2._opener.handlers[1].set_http_debuglevel(100)
>>> urllib2.urlopen('http://mit.edu').read()[:10]
connect: (mit.edu, 80)
send: 'GET / HTTP/1.1\r\nAccept-Encoding: identity\r\nHost: mit.edu\r\nConnection: close\r\nUser-Agent: Python-urllib/2.5\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Date: Tue, 14 Oct 2008 15:52:03 GMT
header: Server: MIT Web Server Apache/1.3.26 Mark/1.5 (Unix) mod_ssl/2.8.9 OpenSSL/0.9.7c
header: Last-Modified: Tue, 14 Oct 2008 04:02:15 GMT
header: ETag: "71d3f96-2895-48f419c7"
header: Accept-Ranges: bytes
header: Content-Length: 10389
header: Connection: close
header: Content-Type: text/html
'<!DOCTYPE '

urllib.urlopen() throws the following request at the server:

GET / HTTP/1.0
Host: 127.0.0.1
User-Agent: Python-urllib/1.17

while urllib2.urlopen() throws this:

GET / HTTP/1.1
Accept-Encoding: identity
Host: 127.0.0.1
Connection: close
User-Agent: Python-urllib/2.5

So, your server either doesn't understand HTTP/1.1 or the extra header fields.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!