urllib2 Error 403: Forbidden

Submitted by 情到浓时终转凉″ on 2019-12-31 06:24:12

Question


I have posted to this site before and received really helpful guidance, so I return with another question.

Where have I gone wrong here? I was pretty sure this is what is required to access information from various sites, in this case the CME Group.

import urllib2

url = "http://www.cmegroup.com/trading/energy/natural-gas/natural-gas.html"
request = urllib2.Request(url)
handle = urllib2.urlopen(request)
content = handle.read()
splitted_page = content.split("<span class=\"cmeSubHeading\">", 1)
splitted_page = splitted_page[1].split("</span>", 1)
print splitted_page[0]

The error reads:

HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 403: Forbidden

Thank you greatly in advance.


Answer 1:


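Actually, the problem is that the site blocks every request that doesn't carry a browser-like User-Agent header. By default urllib2 identifies itself as Python-urllib/x.y, so set the header explicitly: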

import urllib2

url = "http://www.cmegroup.com/trading/energy/natural-gas/natural-gas.html"
# Send a browser-like User-Agent instead of the default Python-urllib one.
request = urllib2.Request(url, None, {'User-Agent': 'Mozilla/5.0'})
content = urllib2.urlopen(request).read()
# Keep the text between the first cmeSubHeading span and its closing tag.
splitted_page = content.split("<span class=\"cmeSubHeading\">", 1)
splitted_page = splitted_page[1].split("</span>", 1)
print splitted_page[0]
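
For readers on Python 3, where urllib2 was split into urllib.request and urllib.error, the same idea might look roughly like the sketch below. It is not part of the original answer: the helper names build_request, extract_between, and fetch_sub_heading are hypothetical, the 10-second timeout is an arbitrary choice, and it assumes the page decodes as UTF-8 and still contains the cmeSubHeading span.

```python
import urllib.error
import urllib.request

URL = "http://www.cmegroup.com/trading/energy/natural-gas/natural-gas.html"


def build_request(url, user_agent="Mozilla/5.0"):
    """Build a request carrying a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def extract_between(text, start, end):
    """Return the text between the first `start` and the next `end`, or None."""
    _, found_start, rest = text.partition(start)
    if not found_start:
        return None
    value, found_end, _ = rest.partition(end)
    return value if found_end else None


def fetch_sub_heading(url=URL):
    """Return the cmeSubHeading text, or None on an HTTP error or missing span."""
    try:
        with urllib.request.urlopen(build_request(url), timeout=10) as handle:
            # Assumption: the page is UTF-8; undecodable bytes are replaced.
            content = handle.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # A 403 here means the server still refuses the request.
        print("HTTP error %d: %s" % (err.code, err.reason))
        return None
    return extract_between(content, '<span class="cmeSubHeading">', "</span>")
```

Calling fetch_sub_heading() performs the actual network request. Unlike the chained split calls, extract_between returns None instead of raising IndexError when the marker is missing, which can happen if the site changes its markup.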



Answer 2:


If you have to make GET requests, I recommend the Requests Python package. You can read about its advantages in this post.

However, if you're getting a 403 response, you may be trying to access restricted data (Wikipedia link).
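
As a rough illustration of the Requests suggestion, the sketch below sends the same GET request with a custom User-Agent and treats a 403 as "access refused". It assumes the third-party requests package is installed (pip install requests); the helper names prepare_get and fetch_page, the Mozilla/5.0 header value, and the 10-second timeout are illustrative choices, not something this answer specifies.

```python
import requests

URL = "http://www.cmegroup.com/trading/energy/natural-gas/natural-gas.html"
HEADERS = {"User-Agent": "Mozilla/5.0"}


def prepare_get(url, headers=HEADERS):
    """Build the GET request without sending it, so it can be inspected."""
    return requests.Request("GET", url, headers=headers).prepare()


def fetch_page(url=URL, headers=HEADERS):
    """Return the page body as text, or None if the server answers 403."""
    response = requests.get(url, headers=headers, timeout=10)
    if response.status_code == 403:
        # Forbidden: the server understood the request but refuses to serve it.
        return None
    # Raise requests.HTTPError for any other 4xx/5xx status.
    response.raise_for_status()
    return response.text
```

Only fetch_page() touches the network; prepare_get() is useful for checking which headers would be sent before making the call.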



Source: https://stackoverflow.com/questions/26994972/urllib2-error-403-forbidden
