Python requests module and connection reuse

一整个雨季 2020-12-05 10:15

I am working with Python's requests module for HTTP communication, and I am wondering how to reuse already-established TCP connections. The requests module is stateless, and if I repeatedly call get for the same URL, wouldn't it create a new connection each time?

2 Answers
  • 2020-12-05 11:00

    Global functions like requests.get or requests.post create a new requests.Session instance on each call. Connections made with these functions cannot be reused, because you cannot access the automatically created session and use its connection pool for subsequent requests. It's fine to use these functions if you only need to make a few requests. Otherwise you'll want to manage sessions yourself.

    Here is a quick demonstration of how requests behaves when you use the global get function versus a session.

    Preparation, not really relevant to the question:

    >>> import logging, requests, timeit
    >>> logging.basicConfig(level=logging.DEBUG, format="%(message)s")
    

    See, a new connection is established each time you call get:

    >>> _ = requests.get("https://www.wikipedia.org")
    Starting new HTTPS connection (1): www.wikipedia.org
    >>> _ = requests.get("https://www.wikipedia.org")
    Starting new HTTPS connection (1): www.wikipedia.org
    

    But if you use the same session for subsequent calls, the connection gets reused:

    >>> session = requests.Session()
    >>> _ = session.get("https://www.wikipedia.org")
    Starting new HTTPS connection (1): www.wikipedia.org
    >>> _ = session.get("https://www.wikipedia.org")
    >>> _ = session.get("https://www.wikipedia.org")
    >>> _ = session.get("https://www.wikipedia.org")
    

    Performance:

    >>> timeit.timeit('_ = requests.get("https://www.wikipedia.org")', 'import requests', number=100)
    Starting new HTTPS connection (1): www.wikipedia.org
    Starting new HTTPS connection (1): www.wikipedia.org
    Starting new HTTPS connection (1): www.wikipedia.org
    ...
    Starting new HTTPS connection (1): www.wikipedia.org
    Starting new HTTPS connection (1): www.wikipedia.org
    Starting new HTTPS connection (1): www.wikipedia.org
    52.74904417991638
    >>> timeit.timeit('_ = session.get("https://www.wikipedia.org")', 'import requests; session = requests.Session()', number=100)
    Starting new HTTPS connection (1): www.wikipedia.org
    15.770191192626953
    

    Reusing the session (and thus the session's connection pool) is much faster: about 15.8 seconds versus 52.7 seconds for 100 requests in this run.
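
    The same comparison can be reproduced without touching a public site. The following is only a sketch, assuming a throwaway local server built on the standard library's http.server; CountingHandler, count_connections, with_global_get and with_session are hypothetical names made up for this illustration. The server counts accepted TCP connections so the difference shows up as a number rather than as log lines.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

import requests


class CountingHandler(BaseHTTPRequestHandler):
    # HTTP/1.1 enables keep-alive, so a client is allowed to reuse the connection.
    protocol_version = "HTTP/1.1"
    connections = 0  # incremented once per accepted TCP connection

    def setup(self):
        # setup() runs once per connection, not once per request.
        super().setup()
        type(self).connections += 1

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def count_connections(make_requests):
    """Run make_requests(url) against a local server and return the
    number of TCP connections the server accepted."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        make_requests(f"http://127.0.0.1:{server.server_port}/")
    finally:
        server.shutdown()
        server.server_close()
    return CountingHandler.connections


def with_global_get(url):
    # Each call builds and discards its own Session.
    for _ in range(3):
        requests.get(url)


def with_session(url):
    # One Session, one connection pool, shared by all three requests.
    with requests.Session() as session:
        for _ in range(3):
            session.get(url)


if __name__ == "__main__":
    print(count_connections(with_global_get))  # 3
    print(count_connections(with_session))     # 1
```

    Using the session as a context manager also closes its pooled connections when the block exits.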

  • 2020-12-05 11:10

    The requests module is stateless and if I repeatedly call get for the same URL, wouldn't it create a new connection each time?

    The requests module is not stateless; it just lets you ignore the state and effectively use a global singleton state if you choose to do so.*

    And it (or, rather, one of the underlying libraries, urllib3) maintains a connection pool keyed by scheme, host, and port, so it will usually reuse an existing connection automatically if it can.

    As the documentation says:

    Excellent news — thanks to urllib3, keep-alive is 100% automatic within a session! Any requests that you make within a session will automatically reuse the appropriate connection!

    Note that connections are only released back to the pool for reuse once all body data has been read; be sure to either set stream to False or read the content property of the Response object.

    So, what does "if it can" mean? As the docs above imply, if you're keeping streaming response objects alive, their connections obviously can't be reused.
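
    To make the streaming caveat concrete, here is a rough sketch under the same kind of assumptions: a throwaway local server that counts accepted connections, with CountingHandler and connections_used as hypothetical names invented for this example. It issues two stream=True requests on one session, once leaving the bodies unread and once reading them.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

import requests


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # allow keep-alive
    connections = 0  # incremented once per accepted TCP connection

    def setup(self):
        super().setup()
        type(self).connections += 1

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def connections_used(read_body):
    """Issue two streaming GETs on one session and return how many
    TCP connections the local server accepted."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_port}/"
    responses = []  # keep the responses alive so they hold their connections
    try:
        with requests.Session() as session:
            for _ in range(2):
                response = session.get(url, stream=True)
                if read_body:
                    # Reading the whole body releases the connection to the pool.
                    _ = response.content
                responses.append(response)
            return CountingHandler.connections
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(connections_used(read_body=False))  # 2
    print(connections_used(read_body=True))   # 1
```

    With the bodies left unread, the second request cannot get the first connection back, so a second one is opened.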

    Also, the connection pool is really a finite cache, not infinite, so if you spam out a ton of connections and two of them are to the same server, you won't always reuse the connection, just often. But usually, that's what you actually want.


    * The particular state relevant here is the transport adapter. Each session gets a transport adapter. You can specify the adapter manually, or you can specify a global default, or you can just use the built-in default, which basically just wraps up a urllib3.PoolManager for managing its HTTP connections. For more information, read the docs.
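
    As a sketch of specifying the adapter manually: the snippet below mounts an HTTPAdapter with explicit pool sizes on a session. The value 20 is an arbitrary choice for illustration (the library default for both parameters is 10), and the URL is only used for prefix matching, so no request is sent.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3 import PoolManager

session = requests.Session()

# pool_connections: how many per-host pools the adapter caches.
# pool_maxsize: how many connections each pool keeps around for reuse.
# Both values are arbitrary here; tune them to your workload.
adapter = HTTPAdapter(pool_connections=20, pool_maxsize=20)

# Replace the default adapters for both URL schemes.
session.mount("https://", adapter)
session.mount("http://", adapter)

# The session picks an adapter by URL prefix; nothing is sent over the network.
chosen = session.get_adapter("https://www.wikipedia.org/")
print(chosen is adapter)  # True

# The adapter delegates connection handling to a urllib3 PoolManager.
print(isinstance(adapter.poolmanager, PoolManager))  # True

session.close()
```

    Raising the pool sizes mainly helps when one session talks to many hosts or is shared across threads, since the pool is a finite cache.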
