What are the differences between the urllib, urllib2, urllib3 and requests modules?

隐瞒了意图╮ asked on 2020-11-22 04:19

In Python, what are the differences between the urllib, urllib2, urllib3 and requests modules? Why are there three? They seem to do the same thing...

11 Answers
  • 2020-11-22 04:43

    I like the urllib.urlencode function, and it doesn't appear to exist in urllib2.

    >>> urllib.urlencode({'abc':'d f', 'def': '-!2'})
    'abc=d+f&def=-%212'
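    In Python 3 this function moved to urllib.parse; a minimal sketch of the equivalent call, with the same input as above:

    ```python
    # Python 3: urlencode lives in urllib.parse rather than top-level urllib
    from urllib.parse import urlencode

    query = urlencode({'abc': 'd f', 'def': '-!2'})
    print(query)  # 'abc=d+f&def=-%212' -- spaces become '+', '!' is percent-escaped
    ```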
    
  • 2020-11-22 04:44

    This is my understanding of what the relations are between the various "urllibs":

    In the Python 2 standard library there exist two HTTP libraries side-by-side. Despite the similar name, they are unrelated: they have a different design and a different implementation.

    • urllib was the original Python HTTP client, added to the standard library in Python 1.2.
    • urllib2 was a more capable HTTP library, added in Python 1.6, intended eventually to replace urllib.

    The Python 3 standard library has a new urllib that is a merged/refactored/rewritten version of those two packages.

    urllib3 is a third-party package. Despite the name, it is unrelated to the standard library packages, and there is no intention to include it in the standard library in the future.

    Finally, requests internally uses urllib3, but it aims for an easier-to-use API.

  • 2020-11-22 04:47

    One considerable difference concerns porting from Python 2 to Python 3. urllib2 does not exist in Python 3, and its methods were ported to urllib. So if you use it heavily and want to migrate to Python 3 in the future, consider using urllib. That said, the 2to3 tool will automatically do most of the rewriting for you.
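    A sketch of where the common Python 2 names ended up in Python 3 (a non-exhaustive mapping; the request below is built but never sent):

    ```python
    # Common Python 2 -> Python 3 renames:
    #   urllib2.urlopen  -> urllib.request.urlopen
    #   urllib2.Request  -> urllib.request.Request
    #   urllib2.URLError -> urllib.error.URLError
    #   urllib.urlencode -> urllib.parse.urlencode
    from urllib.request import Request
    from urllib.parse import urlencode

    # Build (but do not send) a request, to show the new namespaces in use:
    req = Request('http://www.example.com?' + urlencode({'q': 'python'}))
    print(req.full_url)  # http://www.example.com?q=python
    ```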

  • 2020-11-22 04:50

    You should generally use urllib2, since it makes things a bit easier at times by accepting Request objects and will also raise a URLError on protocol errors. With Google App Engine, though, you can't use either; you have to use the URL Fetch API that Google provides in its sandboxed Python environment.
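    For reference, both names survive in Python 3 as urllib.request.Request and urllib.error.URLError; a minimal sketch that builds a Request without sending it:

    ```python
    from urllib.request import Request
    from urllib.error import URLError

    # Build a Request object; nothing is sent until it is passed to urlopen()
    req = Request('http://www.example.com/', headers={'User-Agent': 'demo'})
    print(req.get_method())               # 'GET' (becomes 'POST' once data is attached)
    print(issubclass(URLError, OSError))  # True: protocol errors surface as OS errors
    ```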

  • 2020-11-22 04:51

    I think all the answers are pretty good, but they offer few details about urllib3. urllib3 is a very powerful HTTP client for Python. To install it, either of the following will work:

    urllib3

    Using pip:

    pip install urllib3
    

    or you can get the latest code from GitHub and install it with:

    $ git clone git://github.com/urllib3/urllib3.git
    $ cd urllib3
    $ python setup.py install
    

    Then you are ready to go. Just import urllib3:

    import urllib3
    

    Here, instead of creating a connection directly, you need a PoolManager instance to make requests; it handles connection pooling and thread safety for you. There is also a ProxyManager object for routing requests through an HTTP/HTTPS proxy (see the documentation). Example usage:

    >>> from urllib3 import PoolManager
    >>> manager = PoolManager(10)
    >>> r = manager.request('GET', 'http://google.com/')
    >>> r.headers['server']
    'gws'
    >>> r = manager.request('GET', 'http://yahoo.com/')
    >>> r.headers['server']
    'YTS/1.20.0'
    >>> r = manager.request('POST', 'http://google.com/mail')
    >>> r = manager.request('HEAD', 'http://google.com/calendar')
    >>> len(manager.pools)
    2
    >>> conn = manager.connection_from_host('google.com')
    >>> conn.num_requests
    3
    

    As mentioned in the urllib3 documentation, urllib3 brings many critical features that are missing from the Python standard library:

    • Thread safety.
    • Connection pooling.
    • Client-side SSL/TLS verification.
    • File uploads with multipart encoding.
    • Helpers for retrying requests and dealing with HTTP redirects.
    • Support for gzip and deflate encoding.
    • Proxy support for HTTP and SOCKS.
    • 100% test coverage.

    Follow the user guide for more details.

    • Response content (The HTTPResponse object provides status, data, and header attributes)
    • Using io Wrappers with Response content
    • Creating a query parameter
    • Advanced usage of urllib3
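    The retry helper from the feature list above can be sketched as follows (assuming urllib3 is installed; no request is actually sent):

    ```python
    import urllib3
    from urllib3.util import Retry

    # A retry policy: up to 3 attempts, exponential backoff between them,
    # and retries forced for the listed status codes.
    retry = Retry(total=3, backoff_factor=0.5, status_forcelist=[502, 503])

    # Attach it to a PoolManager so every request made through it inherits it:
    http = urllib3.PoolManager(retries=retry)
    print(retry.total)  # 3
    ```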

    requests

    requests uses urllib3 under the hood and makes it even simpler to make requests and retrieve data. For one thing, keep-alive is 100% automatic, compared to urllib3, where it's not. requests also has event hooks, which call a callback function when an event is triggered, such as receiving a response. In requests, each request type has its own function, so instead of creating a connection or a pool, you directly GET a URL.


    To install requests using pip, just run:

    pip install requests

    or you can install it from the source code:

    $ git clone git://github.com/psf/requests.git
    $ cd requests
    $ python setup.py install
    

    Then, import requests

    Here you can refer to the official documentation. For advanced usage such as the session object, SSL verification, and event hooks, please refer to this url.
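    As a sketch of how little ceremony requests needs (hypothetical URL; the request is prepared but never sent, so no network access happens):

    ```python
    import requests

    # requests.get()/post()/... each build a Request, prepare it, and send it
    # through a urllib3 connection pool. Preparing one by hand shows the steps:
    req = requests.Request('GET', 'http://example.com/search', params={'q': 'python'})
    prepared = req.prepare()   # merges params into the URL, fills in headers
    print(prepared.method)     # 'GET'
    print(prepared.url)        # 'http://example.com/search?q=python'
    ```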

  • 2020-11-22 04:52

    urllib2 provides some extra functionality: namely, the urlopen() function lets you specify headers (previously you had to use httplib, which is far more verbose). More importantly, urllib2 provides the Request class, which allows for a more declarative approach to making a request:

    import urllib
    from urllib2 import Request, urlopen

    r = Request(url='http://www.mysite.com')
    r.add_header('User-Agent', 'awesome fetcher')
    r.add_data(urllib.urlencode({'foo': 'bar'}))
    response = urlopen(r)
    

    Note that urlencode() is only in urllib, not urllib2.

    There are also handlers for implementing more advanced URL support in urllib2. The short answer: unless you're working with legacy code, you probably want to use the URL opener from urllib2, but you still need to import urllib for some of the utility functions.
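    The handler/opener mechanism mentioned above can be sketched with Python 3's urllib.request, where urllib2's interface lives on (the ProxyHandler here is just an illustrative choice of handler):

    ```python
    import urllib.request

    # build_opener() chains the default handlers (HTTP, HTTPS, redirects, ...)
    # with any extra handlers you pass in:
    proxy = urllib.request.ProxyHandler({})   # empty mapping: proxy nothing
    opener = urllib.request.build_opener(proxy)
    opener.addheaders = [('User-Agent', 'awesome fetcher')]

    # install_opener() makes this opener the default used by urlopen():
    urllib.request.install_opener(opener)
    print(type(opener).__name__)  # OpenerDirector
    ```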

    Bonus answer: with Google App Engine, you can use any of httplib, urllib, or urllib2, but all of them are just wrappers for Google's URL Fetch API. That is, you are still subject to the same limitations, such as ports, protocols, and the allowed length of the response. You can use the core of the libraries as you would expect for retrieving HTTP URLs, though.
