Question
The task I'm implementing scrapes some basic info about a URL, such as the title, description, and OGP metadata. If User A requests 200 URLs to scrape and User B then requests 10 URLs, User B may wait much longer than they expect.
What I'm trying to achieve is to rate limit a specific task on a per-user basis or, at least, to be fair between users.
The Celery implementation of rate limiting is too broad, since it keys only on the task name.
Do you have any suggestion to achieve this kind of fairness?
Related: Celery (Django) Rate limiting
Answer 1:
Another way would be to rate limit individual users using a lock. Use the user id as the lock name. If the lock is already held, retry after some task-dependent delay.
Basically, do this:
Ensuring a task is only executed one at a time
Lock on the user id and retry instead of doing nothing if the lock can't be acquired. Also, it would be better to use Redis instead of the Django cache, but either way will work.
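A minimal sketch of what this could look like, loosely following the Celery "ensuring a task is only executed one at a time" recipe. The lock key format, timeout, retry delay, and the `do_scrape` helper are illustrative assumptions, not part of the original answer:

```python
from celery import shared_task
from django.core.cache import cache

LOCK_EXPIRE = 60  # seconds; should exceed the longest expected scrape (assumption)

@shared_task(bind=True, max_retries=None)
def scrape_url(self, user_id, url):
    lock_id = f"scrape-lock-{user_id}"
    # cache.add is atomic: it only sets the key if it does not already exist,
    # so it works as a simple per-user lock.
    if cache.add(lock_id, "locked", LOCK_EXPIRE):
        try:
            return do_scrape(url)  # your actual scraping logic (hypothetical helper)
        finally:
            cache.delete(lock_id)
    # Another task for this user holds the lock: retry after a short delay.
    raise self.retry(countdown=5)
```

With Redis as the cache backend, the same `cache.add` call is backed by an atomic SETNX-style operation, which is why the answer suggests it over the default Django cache.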
Answer 2:
One way to work around this could be to ensure that a user does not enqueue more than x tasks, which means counting, for each user, the number of unprocessed tasks they have enqueued (on the Django side, not trying to do this with Celery).
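A rough sketch of that idea, keeping a per-user counter on the Django side and checking it before enqueueing. The cache-based counter, the cap of 20, and the `scrape_url` task name are assumptions for illustration:

```python
from django.core.cache import cache

MAX_PENDING = 20  # hypothetical per-user cap on unprocessed tasks

def enqueue_scrape(user_id, url):
    key = f"pending-scrapes-{user_id}"
    cache.add(key, 0)            # initialise the counter if it does not exist
    pending = cache.incr(key)    # atomic increment
    if pending > MAX_PENDING:
        cache.decr(key)
        raise RuntimeError("Too many pending scrapes for this user")
    scrape_url.delay(user_id, url)

# Inside the task, decrement the counter once the URL has been processed:
#     cache.decr(f"pending-scrapes-{user_id}")
```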
Answer 3:
How about, instead of running all URL scrapes in a single task, making each scrape its own task and then running them as chains or groups?
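A short sketch of this approach: each URL becomes its own small task, dispatched as a Celery group so the worker pool can interleave different users' work. The `scrape_one` task and `do_scrape` helper are illustrative names:

```python
from celery import group, shared_task

@shared_task
def scrape_one(url):
    return do_scrape(url)  # your actual scraping logic (hypothetical helper)

def scrape_many(urls):
    # Each URL is a separate task, so User B's 10 tasks no longer wait
    # behind one monolithic 200-URL task from User A.
    job = group(scrape_one.s(url) for url in urls)
    return job.apply_async()
```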
Source: https://stackoverflow.com/questions/22759733/how-can-celery-distribute-users-tasks-in-a-fair-way