Limiting number of HTTP requests per second on Python

Submitted by 南笙酒味 on 2019-12-05 03:08:37

Question


I've written a script that fetches URLs from a file and sends HTTP requests to all the URLs concurrently. I now want to limit the number of HTTP requests per second and the bandwidth per interface (eth0, eth1, etc.) in a session. Is there any way to achieve this in Python?


Answer 1:


You could use a Semaphore object, which is part of the Python standard library (see the threading module documentation).

Or, if you want to work with threads directly, you could use wait([timeout]), for example on a threading.Event or threading.Condition object.

No library bundled with Python works directly on an Ethernet or other network interface. The lowest level you can reach is the socket module.

Based on your reply, here's my suggestion. Note the call to active_count(): use it only to check that your script runs no more than two request threads at a time. It will actually report three, because the main thread of your script counts as one, plus the two URL request threads.

import threading

import requests

# Limit the number of concurrent request threads to two.
pool = threading.BoundedSemaphore(2)

def worker(u):
    try:
        # Request the passed URL.
        r = requests.get(u)
        print(r.status_code)
        # Show the number of active threads (for testing only).
        print(threading.active_count())
    finally:
        # Release the semaphore slot for other threads, even if the request fails.
        pool.release()

def req():
    # Get URLs from a text file, strip whitespace, skip blank lines.
    with open('urllist.txt') as f:
        urls = [line.strip() for line in f if line.strip()]
    for u in urls:
        # Block here until one of the two slots is free.
        pool.acquire(blocking=True)
        # Create a new thread and pass the URL (the u parameter) to worker.
        t = threading.Thread(target=worker, args=(u,))
        # Start the newly created thread.
        t.start()

req()
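
Note that a semaphore bounds how many requests run at the same time, not how many start per second. If a per-second cap is what you need, one possible approach is to space out the calls with a small thread-safe limiter. The following is only a minimal sketch: the names RateLimiter, max_per_second and demo are hypothetical and not part of any library, and the demo records timestamps instead of sending real requests.

```python
import threading
import time


class RateLimiter:
    """Hypothetical helper: lets through at most `max_per_second` calls per second."""

    def __init__(self, max_per_second):
        self.interval = 1.0 / max_per_second
        self.lock = threading.Lock()
        self.next_allowed = time.monotonic()

    def wait(self):
        # Reserve the next free time slot under the lock, then sleep outside it
        # so that other threads can reserve their own slots in the meantime.
        with self.lock:
            now = time.monotonic()
            slot = max(self.next_allowed, now)
            self.next_allowed = slot + self.interval
        delay = slot - now
        if delay > 0:
            time.sleep(delay)


def demo(calls=5, max_per_second=10):
    # Stand-in for real requests: record when each call was let through.
    limiter = RateLimiter(max_per_second)
    stamps = []
    for _ in range(calls):
        limiter.wait()
        stamps.append(time.monotonic())
    return stamps
```

With a single limiter shared by all threads, each worker would call limiter.wait() just before requests.get(u), so the starts of consecutive requests are at least 1 / max_per_second seconds apart.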



Answer 2:


You could use the worker pattern described in the documentation: https://docs.python.org/3.4/library/queue.html

Add a wait (for example, time.sleep()) inside your workers so that they pause between requests (in the documentation's example: inside the "while True" loop, after task_done()).

Example: 5 worker threads with a waiting time of 1 second between requests will perform fewer than 5 fetches per second, since each request also takes some time.
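
The worker idea could be sketched roughly as follows. This is an illustration under assumptions, not the code from the documentation: run_workers, fetch, num_workers and delay are hypothetical names, and fetch stands for whatever function sends the HTTP request (for instance a wrapper around requests.get).

```python
import queue
import threading
import time


def run_workers(urls, fetch, num_workers=5, delay=1.0):
    """Hypothetical helper: process `urls` with `num_workers` threads,
    each pausing `delay` seconds after every request."""
    tasks = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            url = tasks.get()
            if url is None:  # Sentinel: no more work for this thread.
                tasks.task_done()
                break
            try:
                outcome = fetch(url)
            except Exception as exc:  # Keep the worker alive on failures.
                outcome = exc
            with results_lock:
                results.append((url, outcome))
            tasks.task_done()
            # Pause so this worker sends at most one request per `delay` seconds.
            time.sleep(delay)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for url in urls:
        tasks.put(url)
    for _ in threads:
        tasks.put(None)  # One sentinel per worker.
    tasks.join()
    for t in threads:
        t.join()
    return results
```

With num_workers=5 and delay=1.0, the overall rate stays below 5 requests per second, which matches the example above. Results arrive in completion order, not in input order.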



Source: https://stackoverflow.com/questions/26098711/limiting-number-of-http-requests-per-second-on-python
