timeout for urllib2.urlopen() in pre Python 2.6 versions

情到浓时终转凉″ 提交于 2019-11-30 04:27:27

you can set a global timeout for all socket operations (including HTTP requests) by using:

socket.setdefaulttimeout()

like this:

import urllib2
import socket
socket.setdefaulttimeout(30)
f = urllib2.urlopen('http://www.python.org/')

in this case, your urllib2 request would timeout after 30 secs and throw a socket exception. (this was added in Python 2.3)

With considerable irritation, you can override the httplib.HTTPConnection class that the urllib2.HTTPHandler uses.

def urlopen_with_timeout(url, data=None, timeout=None):

  # Create these two helper classes fresh each time, since
  # timeout needs to be in the closure.
  class TimeoutHTTPConnection(httplib.HTTPConnection):
    def connect(self):
      """Connect to the host and port specified in __init__."""
      msg = "getaddrinfo returns an empty list"
      for res in socket.getaddrinfo(self.host, self.port, 0,
                      socket.SOCK_STREAM): 
        af, socktype, proto, canonname, sa = res
        try:
          self.sock = socket.socket(af, socktype, proto)
          if timeout is not None:
            self.sock.settimeout(timeout)
          if self.debuglevel > 0:
            print "connect: (%s, %s)" % (self.host, self.port)
          self.sock.connect(sa)
        except socket.error, msg:
          if self.debuglevel > 0:
            print 'connect fail:', (self.host, self.port)
          if self.sock:
            self.sock.close()
          self.sock = None
          continue
        break
      if not self.sock:
        raise socket.error, msg

  class TimeoutHTTPHandler(urllib2.HTTPHandler):
    http_request = urllib2.AbstractHTTPHandler.do_request_
    def http_open(self, req):
      return self.do_open(TimeoutHTTPConnection, req)

  opener = urllib2.build_opener(TimeoutHTTPHandler)
  opener.open(url, data)

I think your best choice is to patch (or deploy an local version of) your urllib2 with the change from the 2.6 maintenance branch

The file should be in /usr/lib/python2.4/urllib2.py (on linux and 2.4)

I use httplib from the standard library. It has a dead simple API, but only handles http as you might guess. IIUC urllib uses httplib to implement the http stuff.

You must set timeout in two places.

import urllib2
import socket

socket.setdefaulttimeout(30)
f = urllib2.urlopen('http://www.python.org/', timeout=30)

Well, the way timeout is handled in either 2.4 or 2.6 is the same. If you open the urllib2.py file in 2.6 u would see that it takes an extra argument as timeout and handles it using the socket.defaulttimeout() method as mentioned is answer 1.

So you really need not update your urllib2.py in that case.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!