Question
I get the following error when downloading HTML pages from a list of URLs:
Error: raise URLError(err) urllib2.URLError: <urlopen error [Errno 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond>
Code:
import urllib2

hdr = {'User-Agent': 'Mozilla/5.0'}
for i, site in enumerate(urls[index]):
    print(site)
    req = urllib2.Request(site, headers=hdr)
    page = urllib2.build_opener(urllib2.HTTPCookieProcessor).open(req)
    page_content = page.read()
    with open(path_current + '/' + str(i) + '.html', 'w') as fid:
        fid.write(page_content)
I suspect it may be due to proxy settings or may require changing the timeout, but I am not sure. Please help; the URLs open perfectly fine when I check them manually.
Answer 1:
Well, since the error only happens some of the time, I can infer that your network is probably slow. Try setting a timeout when opening the request:
req = urllib2.Request(site, headers=hdr)
timeout_in_sec = 360
page = urllib2.build_opener(urllib2.HTTPCookieProcessor).open(req, timeout=timeout_in_sec)
page_content = page.read()
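If a longer timeout alone doesn't cure the intermittent Errno 10060 failures, a common complement is to retry the fetch a few times with exponential backoff. Below is a minimal sketch of such a helper; `retry` and its parameters are hypothetical names, not part of the original post, and the imports are the Python 3 spellings (in Python 2 the exception is `urllib2.URLError`):

```python
import time
from urllib.error import URLError  # urllib2.URLError in Python 2


def retry(func, retries=3, backoff=1.0, exceptions=(URLError,)):
    """Call func(), retrying on the given exceptions with exponential backoff.

    Hypothetical helper for illustration, not part of the original code.
    Waits backoff * 2**attempt seconds between attempts, then re-raises the
    last error once all retries are exhausted.
    """
    last_err = None
    for attempt in range(retries):
        try:
            return func()
        except exceptions as err:
            last_err = err
            if attempt < retries - 1:
                time.sleep(backoff * (2 ** attempt))  # e.g. 1s, 2s, 4s, ...
    raise last_err
```

Inside the original loop it could wrap the open-and-read step, e.g. `page_content = retry(lambda: urllib2.build_opener(urllib2.HTTPCookieProcessor).open(req, timeout=timeout_in_sec).read())`, so that a single dropped connection doesn't abort the whole download run.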
Source: https://stackoverflow.com/questions/30373301/timeout-error-when-downloading-html-files-from-urls