HTTP Error 403: Forbidden with urlretrieve

£可爱£侵袭症+ posted on 2019-12-06 14:18:16

You seem to have already realised this: the remote server is apparently checking the User-Agent header and rejecting requests from Python's urllib. urllib.request.urlretrieve() has no parameter for setting HTTP headers, but urllib.request.URLopener.retrieve() lets you add them:

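import urllib.request

# Same PDF URL as in the requests example further down
url = 'http://papers.xtremepapers.com/CIE/Cambridge%20IGCSE/Mathematics%20(0580)/0580_s03_qp_1.pdf'

opener = urllib.request.URLopener()
# Send a custom User-Agent instead of urllib's default one
opener.addheader('User-Agent', 'whatever')
# retrieve() saves the response body to Test.pdf
filename, headers = opener.retrieve(url, 'Test.pdf')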

N.B. You are using Python 3, where these functions are considered part of the "Legacy interface", and URLopener has been deprecated since Python 3.3. For that reason you should not use them in new code.
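For new code, one non-deprecated way to send a custom User-Agent with the standard library is to build a urllib.request.Request and pass it to urlopen(). The following is only a minimal sketch: the download() helper and its 'Mozilla/5.0' default are illustrative choices, not part of the original answer, and whether a particular User-Agent value gets past the server's check is an assumption you would need to verify.

```python
import shutil
import urllib.request


def download(url, dest, user_agent="Mozilla/5.0"):
    """Save the resource at `url` to `dest`, sending a custom User-Agent.

    `download` and the default user_agent value are hypothetical,
    chosen for illustration only.
    """
    # Unlike urlretrieve(), Request accepts a dict of HTTP headers.
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    # For HTTP URLs, urlopen() raises urllib.error.HTTPError on 4xx/5xx
    # responses, so a 403 will not be silently written to disk.
    with urllib.request.urlopen(req) as response, open(dest, "wb") as outfile:
        # Stream the body to the file rather than loading it all into memory.
        shutil.copyfileobj(response, outfile)
    return dest
```

With this helper, download(url, 'Test.pdf') would play the role of the opener.retrieve() call above.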

The above aside, you are going to a lot of trouble simply to access a URL. Your code imports requests but never uses it. You should use it, because it is much easier than urllib. This works for me:

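import requests

url = 'http://papers.xtremepapers.com/CIE/Cambridge%20IGCSE/Mathematics%20(0580)/0580_s03_qp_1.pdf'
r = requests.get(url)
# Fail loudly on an HTTP error (e.g. 403) instead of saving an error page as a PDF
r.raise_for_status()
# r.content is the raw response body, so write it in binary mode
with open('0580_s03_qp_1.pdf', 'wb') as outfile:
    outfile.write(r.content)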