Python - Example of urllib2 asynchronous / threaded request using HTTPS

Asked 2021-02-02 13:50

I'm having a heck of a time getting asynchronous/threaded HTTPS requests to work using Python's urllib2.

Does anyone out there have a basic example that implements urllib2 asynchronously or with threads?

5 Answers
  • 2021-02-02 14:26

    There's a really simple way, involving a handler for urllib2, which you can find here: http://pythonquirks.blogspot.co.uk/2009/12/asynchronous-http-request.html

    #!/usr/bin/env python
    
    import urllib2
    import threading
    
    # To hook HTTPS responses too, also subclass urllib2.HTTPSHandler and
    # define an https_response method with the same body.
    class MyHandler(urllib2.HTTPHandler):
        def http_response(self, req, response):
            print "url: %s" % (response.geturl(),)
            print "info: %s" % (response.info(),)
            for line in response:
                print line
            return response
    
    o = urllib2.build_opener(MyHandler())
    t = threading.Thread(target=o.open, args=('http://www.google.com/',))
    t.start()
    print "I'm asynchronous!"
    
    t.join()
    
    print "I've ended!"
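
    For comparison, a rough Python 3 sketch of the same run-the-request-in-a-thread pattern. The opener call is replaced by a stand-in fetch function so the example runs offline; in real use fetch would call urllib.request.build_opener(...).open(url) (urllib.request is urllib2's Python 3 successor):

    ```python
    import queue
    import threading

    def fetch(url):
        # Stand-in for opener.open(url).read(); returns a fake body.
        return "body of %s" % url

    def fetch_async(url, out):
        # Run the request and hand the result back through a queue.
        out.put((url, fetch(url)))

    results = queue.Queue()
    t = threading.Thread(target=fetch_async,
                         args=("https://example.com/", results))
    t.start()
    print("I'm asynchronous!")
    t.join()
    url, body = results.get()
    print("fetched %s (%d bytes)" % (url, len(body)))
    ```

    Handing results back through a Queue keeps the printing in the main thread instead of the handler.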
    
  • 2021-02-02 14:29

    The code below issues 7 HTTP requests concurrently. It does not use threads; instead it uses asynchronous networking with the Twisted library.

    from twisted.web import client
    from twisted.internet import reactor, defer
    
    urls = [
     'http://www.python.org', 
     'http://stackoverflow.com', 
     'http://www.twistedmatrix.com', 
     'http://www.google.com',
     'http://launchpad.net',
     'http://github.com',
     'http://bitbucket.org',
    ]
    
    def finish(results):
        for result in results:
            print 'GOT PAGE', len(result), 'bytes'
        reactor.stop()
    
    waiting = [client.getPage(url) for url in urls]
    defer.gatherResults(waiting).addCallback(finish)
    
    reactor.run()
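
    The same fan-out-and-gather shape is available in the standard library. A minimal Python 3 sketch with concurrent.futures, where fetch is a stand-in for client.getPage so the example runs without a network:

    ```python
    from concurrent.futures import ThreadPoolExecutor

    urls = [
        'http://www.python.org',
        'http://stackoverflow.com',
    ]

    def fetch(url):
        # Stand-in for an HTTP GET; returns a fake page body.
        return "page for " + url

    # Executor.map preserves input order, like defer.gatherResults.
    with ThreadPoolExecutor(max_workers=4) as pool:
        pages = list(pool.map(fetch, urls))

    for page in pages:
        print('GOT PAGE', len(page), 'bytes')
    ```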
    
  • 2021-02-02 14:30

    Here is an example using urllib2 (with HTTPS) and threads. Each thread cycles through a list of URLs and retrieves each resource in turn.

    import itertools
    import urllib2
    from threading import Thread
    
    
    THREADS = 2
    URLS = (
        'https://foo/bar',
        'https://foo/baz',
        )
    
    
    def main():
        for _ in range(THREADS):
            t = Agent(URLS)
            t.start()
    
    
    class Agent(Thread):
        def __init__(self, urls):
            Thread.__init__(self)
            self.urls = urls
    
        def run(self):
            urls = itertools.cycle(self.urls)
            while True:  # runs forever; daemonize the thread or add a stop flag to exit
                data = urllib2.urlopen(urls.next()).read()
    
    
    if __name__ == '__main__':
        main()
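
    One caveat: run() above loops forever and discards the data it reads. A Python 3 sketch of a bounded variant with a threading.Event stop flag, using a stand-in fetch so it runs offline (note that urls.next() becomes next(urls) in Python 3):

    ```python
    import itertools
    import threading

    def fetch(url):
        # Stand-in for urllib2.urlopen(url).read().
        return "data for " + url

    class Agent(threading.Thread):
        def __init__(self, urls, stop, results):
            threading.Thread.__init__(self)
            self.urls = urls
            self.stop = stop
            self.results = results

        def run(self):
            for url in itertools.cycle(self.urls):
                if self.stop.is_set():
                    break
                self.results.append(fetch(url))
                if len(self.results) >= 4:  # arbitrary demo limit
                    self.stop.set()

    stop = threading.Event()
    results = []
    agent = Agent(('https://foo/bar', 'https://foo/baz'), stop, results)
    agent.start()
    agent.join()
    print("fetched %d resources" % len(results))
    ```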
    
  • 2021-02-02 14:31

    You can use asynchronous IO to do this.

    requests + gevent = grequests

    GRequests allows you to use Requests with Gevent to make asynchronous HTTP Requests easily.

    import grequests
    
    urls = [
        'http://www.heroku.com',
        'http://tablib.org',
        'http://httpbin.org',
        'http://python-requests.org',
        'http://kennethreitz.com'
    ]
    
    rs = (grequests.get(u) for u in urls)
    grequests.map(rs)
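
    grequests.map amounts to gathering a batch of concurrent requests. A Python 3 asyncio sketch of the same shape, with fetch as a stand-in for an async HTTP GET:

    ```python
    import asyncio

    urls = [
        'http://www.heroku.com',
        'http://httpbin.org',
    ]

    async def fetch(url):
        # Stand-in for an async HTTP GET.
        await asyncio.sleep(0)
        return "response for " + url

    async def main():
        # gather() runs the requests concurrently and keeps input order,
        # like grequests.map.
        return await asyncio.gather(*(fetch(u) for u in urls))

    responses = asyncio.run(main())
    print(responses)
    ```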
    
  • 2021-02-02 14:32

    Here is an example from eventlet:

    urls = ["http://www.google.com/intl/en_ALL/images/logo.gif",
            "https://wiki.secondlife.com/w/images/secondlife.jpg",
            "http://us.i1.yimg.com/us.yimg.com/i/ww/beta/y3.gif"]
    
    import eventlet
    from eventlet.green import urllib2
    
    def fetch(url):
        return urllib2.urlopen(url).read()
    
    pool = eventlet.GreenPool()
    
    for body in pool.imap(fetch, urls):
        print "got body", len(body)
    