Python urllib freezes with specific URL

你离开我真会死。 submitted on 2019-12-12 14:21:21

Question


I am trying to fetch a page, but urlopen hangs and never returns anything, even though the page is very light and opens in any browser without problems:

import urllib.request
with urllib.request.urlopen("http://www.planalto.gov.br/ccivil_03/_Ato2007-2010/2008/Lei/L11882.htm") as response:
    print(response.read())

This simple code just freezes while retrieving the response, but opening http://www.planalto.gov.br/ccivil_03/_Ato2007-2010/2008/Lei/L11882.htm in a browser works without any problem.


Answer 1:


www.planalto.gov.br uses User-Agent detection. If you send a browser-like User-Agent header, the request completes correctly. urllib didn't crash; it is just waiting for a response.

curl -H "User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36" http://www.planalto.gov.br/ccivil_03/_Ato2007-2010/2008/Lei/L11882.htm

worked just fine for me but

curl http://www.planalto.gov.br/ccivil_03/_Ato2007-2010/2008/Lei/L11882.htm

did not.

As RPGillespie said, add a User-Agent header to the request, either with urllib.request.Request (urllib2 on Python 2) or with requests (see How do I set headers using python's urllib? for more information about that).
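For Python 3, a minimal sketch of that fix could look like the following. It reuses the URL and the User-Agent string from the curl command above; the helper names build_request and fetch and the 10-second timeout are assumptions made for illustration, not part of the original answer. The timeout is there so that a server which never answers raises an error instead of hanging forever.

```python
import urllib.request

URL = "http://www.planalto.gov.br/ccivil_03/_Ato2007-2010/2008/Lei/L11882.htm"

# Browser-like User-Agent string, copied from the curl command in the answer.
USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/57.0.2987.133 Safari/537.36"
)


def build_request(url, user_agent=USER_AGENT):
    """Hypothetical helper: wrap the URL in a Request carrying a User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, timeout=10):
    """Hypothetical helper: fetch the page body as bytes.

    The timeout (in seconds, an assumed value) makes urlopen raise an
    exception instead of blocking indefinitely if the server stays silent.
    """
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return response.read()
```

Calling print(fetch(URL)) would then replace the original two-line snippet. Whether the server still responds this way today is not something the sketch can guarantee.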



Source: https://stackoverflow.com/questions/43987450/python-urllib-freezes-with-specific-url
