Question
To read the content of a given URL, I do the following:
import requests
proxies = {'http':'http://user:pswd@foo-webproxy.foo.com:7777'}
url = 'http://example.com/foo/bar'
r = requests.get(url, proxies = proxies)
print r.text.encode('utf-8')
And it works fine! I get the content.
However, if I use another URL:
url = 'https://en.wikipedia.org/wiki/Mestisko'
It does not work. I get an error message that starts with:
requests.exceptions.ConnectionError: ('Connection aborted.', error(10060
Is Wikipedia blocking automatic requests?
ADDED
I tried to set a user agent in the following way:
headers = {'User-Agent':'Mozilla/5.0'}
r = requests.get(url, proxies = proxies, headers = headers)
Unfortunately it does not help. I still get the same error.
ADDED 2
Now I am confused. If I request http://example.com/foo/bar with the proxy set, I get the content. If I do not set the proxy, I get content generated by the proxy. This behavior I can understand. However, if I request the Wikipedia page, I get the same error message regardless of whether I set the proxy. So I do not understand where this error message comes from: Wikipedia or the proxy (it cannot be both).
Answer 1:
The problem was resolved by replacing:
proxies = {'http':'http://user:pswd@foo-webproxy.foo.com:7777'}
with the following line:
proxies = {'http':'http://user:pswd@foo-webproxy.foo.com:7777', 'https':'http://user:pswd@foo-webproxy.foo.com:7777'}
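The likely reason this works is that requests chooses a proxy by the scheme of the target URL, so a dictionary with only an 'http' key leaves https:// URLs without a proxy. A minimal sketch of that lookup, assuming Python 3; proxy_for is a hypothetical helper written for illustration, and it simplifies what requests does internally (it ignores host-specific keys and proxy environment variables):

```python
from urllib.parse import urlparse

PROXY = 'http://user:pswd@foo-webproxy.foo.com:7777'


def proxy_for(url, proxies):
    """Hypothetical helper: return the proxy registered for the URL's scheme, or None."""
    return proxies.get(urlparse(url).scheme)


http_only = {'http': PROXY}
both = {'http': PROXY, 'https': PROXY}

wiki = 'https://en.wikipedia.org/wiki/Mestisko'

# With only an 'http' entry, the https URL finds no proxy.
print(proxy_for(wiki, http_only))
# With both entries, the same URL is routed through the proxy.
print(proxy_for(wiki, both))
```

Under this reading, the Wikipedia request never reached the proxy in either attempt from the question, which would explain why the error was the same with and without the proxies argument.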
Source: https://stackoverflow.com/questions/34092501/is-it-possible-to-read-wikipedia-using-python-requests-library