I'm working on a Python library that interfaces with a web service API. Like many web services I've encountered, this one requests limiting the rate of requests. I would l
Your rate-limiting scheme should be heavily influenced by the calling conventions of the underlying code (synchronous or async), as well as by the scope (thread, process, machine, cluster?) at which the rate limiting will operate.
I would suggest keeping all the variables within the instance, so you can easily implement multiple periods/rates of control.
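As a rough sketch of what keeping the state in the instance can look like: the class below is hypothetical (WindowLimiter, wait, wait_for_all, and the clock/sleep parameters are names made up for illustration), and it assumes a synchronous, single-threaded caller. Each instance owns its own limit, so several periods can be enforced by waiting on several instances.

```python
import time
from collections import deque


class WindowLimiter:
    """Allow at most max_calls calls in any window of `period` seconds.

    All state lives on the instance, so independent limits do not interfere.
    clock and sleep are injectable only to make the class easy to test.
    """

    def __init__(self, max_calls, period, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._calls = deque()  # timestamps of recent calls, oldest first

    def wait(self):
        """Block until another call is allowed, then record it."""
        while True:
            now = self._clock()
            # Forget calls that have already left the window.
            while self._calls and now - self._calls[0] >= self.period:
                self._calls.popleft()
            if len(self._calls) < self.max_calls:
                self._calls.append(now)
                return
            # Sleep until the oldest recorded call leaves the window.
            self._sleep(self.period - (now - self._calls[0]))


def wait_for_all(limiters):
    """Honor several limits at once by waiting on each limiter in turn."""
    for limiter in limiters:
        limiter.wait()
```

For example, calling wait_for_all([per_second, per_minute]) before each request, with per_second = WindowLimiter(2, 1.0) and per_minute = WindowLimiter(100, 60.0), would enforce both limits. Those numbers are placeholders, not the limits of any real service.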
Lastly, it sounds like you want to be a middleware component. Don't try to be an application and introduce threads on your own. Just block/sleep if you are synchronous and use the async dispatching framework if you are being called by one of them.
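To illustrate the async half of that advice, here is a minimal sketch that assumes the caller runs on asyncio (the answer does not name a framework, so that choice is an assumption). AsyncMinInterval and its wait method are hypothetical names. Instead of blocking the thread, it awaits asyncio.sleep so other tasks keep running while this one waits its turn.

```python
import asyncio
import time


class AsyncMinInterval:
    """Space out calls by at least `delay` seconds without blocking the event loop.

    Create the instance inside a running coroutine so the lock belongs to
    the loop that will actually use it.
    """

    def __init__(self, delay):
        self.delay = delay
        self._last = None            # time of the previous permitted call
        self._lock = asyncio.Lock()  # serializes tasks waiting for their turn

    async def wait(self):
        async with self._lock:
            if self._last is not None:
                remaining = self.delay - (time.monotonic() - self._last)
                if remaining > 0:
                    # Yields to the event loop; a synchronous library would
                    # call time.sleep(remaining) here instead.
                    await asyncio.sleep(remaining)
            self._last = time.monotonic()
```

A coroutine would then do "await limiter.wait()" immediately before each request it sends.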
So I am assuming something simple like import time; time.sleep(2) will not work for waiting 2 seconds between requests.
This works out better with a queue and a dispatcher.
You split your processing into two sides: source and dispatch. These can be separate threads (or separate processes if that's easier).
The Source side creates and enqueues requests at whatever rate makes them happy.
The Dispatch side does this:
Get the request start time, s.
Dequeue a request and process it through the remote service.
Get the current time, t. Sleep for rate - (t - s) seconds if that value is positive.
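The dispatch steps could be sketched roughly as follows, using the standard queue and threading modules. The names dispatch, start_dispatcher, STOP, and send are hypothetical, and send stands in for whatever function actually talks to the remote service. One deliberate difference from the steps as listed: this sketch reads the start time after the dequeue, so that time spent waiting on an empty queue does not count toward the interval.

```python
import queue
import threading
import time

STOP = object()  # sentinel the Source side enqueues to shut the dispatcher down


def dispatch(requests, send, rate, clock=time.monotonic, sleep=time.sleep):
    """Dequeue requests and pass each to send(), at most one per `rate` seconds."""
    while True:
        request = requests.get()    # blocks while the queue is empty
        if request is STOP:
            break
        s = clock()                 # request start time
        send(request)               # process the request through the remote service
        t = clock()                 # time once the call has returned
        remaining = rate - (t - s)
        if remaining > 0:           # no sleep if the call already took longer than rate
            sleep(remaining)


def start_dispatcher(requests, send, rate):
    """Run dispatch() on a background thread and return that thread."""
    worker = threading.Thread(target=dispatch, args=(requests, send, rate), daemon=True)
    worker.start()
    return worker
```

The Source side would create requests = queue.Queue(), call start_dispatcher(requests, send, rate) once, and then requests.put(...) as fast as it likes. Putting STOP on the queue ends the dispatcher.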
If you want to run the Source side connected directly to the remote service, you can do that, and bypass rate limiting. This is good for internal testing with a mock version of the remote service.
The hard part about this is creating some representation for each request that you can enqueue. Since the Python Queue will handle almost anything, you don't have to do much.
If you're using multi-processing, you'll have to pickle your objects to put them into a pipe.
Queuing may be overly complicated. A simpler solution is to give your class a variable for the time the service was last called. Whenever the service is called (!1), set waitTime to delay - Now + lastcalltime. delay should be equal to the minimum allowable time between requests. If waitTime is positive, sleep for that long before making the call (!2). The disadvantage/advantage of this approach is that it treats the web service requests as being synchronous. The advantage is that it is absurdly simple and easy to implement.
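A minimal sketch of this approach, assuming a synchronous wrapper class. ThrottledService, request, and the call argument are hypothetical names, and call stands in for the function that performs the real web service request.

```python
import time


class ThrottledService:
    """Wrap a callable so consecutive calls are at least `delay` seconds apart."""

    def __init__(self, call, delay, clock=time.monotonic, sleep=time.sleep):
        self._call = call
        self.delay = delay            # minimum allowable time between requests
        self._clock = clock
        self._sleep = sleep
        self.last_call_time = None    # no request has been made yet

    def request(self, *args, **kwargs):
        if self.last_call_time is not None:
            # waitTime = delay - Now + lastcalltime
            wait_time = self.delay - self._clock() + self.last_call_time
            if wait_time > 0:
                self._sleep(wait_time)
        try:
            return self._call(*args, **kwargs)
        finally:
            # Record the time once the service has responded (or failed).
            self.last_call_time = self._clock()
```

Because request sleeps in the calling thread, this only suits callers that are happy to block.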
S.Lott's solution is more elegant, of course.
Don't reinvent the wheel unless it's called for. Check out the awesome ratelimit library. It's perfect if you just want to rate-limit your calls to a REST API for whatever reason and get on with your life.
If your library is designed to be synchronous, then I'd recommend leaving out the limit enforcement (although you could track rates and at least help the caller decide how to honor limits).
I use Twisted to interface with pretty much everything nowadays. It makes it easy to do that type of thing by having a model that separates request submission from response handling. If you don't want your API users to have to use Twisted, you'd at least be better off understanding its API for deferred execution.
For example, I have a Twitter interface that pushes a rather absurd number of requests through on behalf of XMPP users. I don't rate limit, but I did have to do a bit of work to prevent all of the requests from happening at the same time.