I am currently designing a web application that will allow users to schedule tasks which will be executed against an HTTP API on their behalf. The tasks can be recurring, and the minimum time resolution for scheduling will be one minute. Because of the nature of the tasks I think it makes sense to execute them asynchronously. However, what should the architecture of this part look like?
I thought about using a task queue, where the web application creates the tasks and a worker executes them. In this case, I have several questions:
- How do I handle recurring tasks?
- How do I easily save the results of the tasks?
- Is it easily possible to make the queue "persistent"?
- Should the workers directly interact with a database?
- Should I queue up recurring tasks manually?
What else could I consider? Since I assume I am not the only one thinking about this kind of web application architecture, are there any "best practices"? Is a task queue the way to go?
Yes, this is a well-known pattern for handling long-lived tasks at the back end of a web application. Depending on your language and application framework there are a number of queue implementations out there, e.g. Resque, Beanstalkd or ActiveMQ, or if your performance requirements are not high you can use a database table as a kind of queue.
The basic idea is that your web application places jobs on a queue with enough content to enable the job to proceed. A group of worker processes in the background (ideally running independently of your web application) read the jobs off the queue and execute them. The results can be written back onto a reply queue or written into a database. It depends on how you want to display the results back to the user. For a web application, writing results into the database probably makes more sense.
Depending on your queue implementation, jobs can be made persistent. E.g. ActiveMQ supports persistent messaging so that messages on a queue are recovered in the event of a failure.
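As a rough illustration of that flow, here is a minimal Python sketch that uses only the standard library. It is not how Resque, Beanstalkd or ActiveMQ work internally: an in-process `queue.Queue` stands in for the real queue, threads stand in for worker processes, and an in-memory SQLite table stands in for the results database. The names `call_api`, `worker`, `run_jobs` and the `results` table are made up for the example.

```python
import json
import queue
import sqlite3
import threading


def call_api(job):
    # Hypothetical stand-in for the real HTTP request made on the user's behalf.
    return {"url": job["url"], "ok": True}


def worker(job_queue, db, db_lock):
    # Read jobs off the queue until a None sentinel arrives.
    while True:
        job = job_queue.get()
        if job is None:
            break
        try:
            row = (job["id"], "done", json.dumps(call_api(job)))
        except Exception as exc:
            row = (job["id"], "failed", str(exc))
        # Write the outcome to the database so the web app can show it later.
        with db_lock:
            db.execute("INSERT INTO results VALUES (?, ?, ?)", row)
            db.commit()


def run_jobs(jobs, num_workers=2):
    job_queue = queue.Queue()
    db_lock = threading.Lock()
    db = sqlite3.connect(":memory:", check_same_thread=False)
    db.execute(
        "CREATE TABLE results (job_id INTEGER PRIMARY KEY, status TEXT, body TEXT)"
    )
    threads = [
        threading.Thread(target=worker, args=(job_queue, db, db_lock))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    # The "web application" side: each job carries enough content to proceed.
    for job in jobs:
        job_queue.put(job)
    for _ in threads:
        job_queue.put(None)  # one stop sentinel per worker
    for t in threads:
        t.join()
    return db.execute(
        "SELECT job_id, status FROM results ORDER BY job_id"
    ).fetchall()
```

In a real deployment the queue and the database would live outside the web process, so that workers can be restarted or scaled independently of the web application.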
You ask about recurring jobs - and I think the answer depends on when they need to recur.
A straight message queue will release messages to workers as soon as they become available, so scheduling is tricky or impossible. To support scheduled jobs (including jobs which recur at a given time or after a given amount of time has elapsed) you should probably look at using a database table as a simple queue with a "start time" attribute.
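To make the "start time" idea concrete, here is a small Python sketch using SQLite. The table layout and the names `create_schema`, `schedule`, `claim_due_jobs` and `interval_seconds` are assumptions for illustration only. A one-off job is deleted once it is picked up, while a recurring job has its start time moved forward by its interval.

```python
import sqlite3


def create_schema(db):
    # start_time is a Unix timestamp in seconds for the next run;
    # interval_seconds is NULL for one-off jobs.
    db.execute(
        """
        CREATE TABLE jobs (
            id INTEGER PRIMARY KEY,
            payload TEXT NOT NULL,
            start_time INTEGER NOT NULL,
            interval_seconds INTEGER
        )
        """
    )


def schedule(db, payload, start_time, interval_seconds=None):
    cur = db.execute(
        "INSERT INTO jobs (payload, start_time, interval_seconds) VALUES (?, ?, ?)",
        (payload, start_time, interval_seconds),
    )
    db.commit()
    return cur.lastrowid


def claim_due_jobs(db, now):
    """Return payloads of all jobs due at `now`, oldest first."""
    rows = db.execute(
        "SELECT id, payload, start_time, interval_seconds FROM jobs "
        "WHERE start_time <= ? ORDER BY start_time, id",
        (now,),
    ).fetchall()
    for job_id, _payload, start_time, interval in rows:
        if interval is None:
            # One-off job: remove it once it has been handed out.
            db.execute("DELETE FROM jobs WHERE id = ?", (job_id,))
        else:
            # Recurring job: move to the next slot strictly after `now`,
            # so a long outage does not trigger a burst of catch-up runs.
            steps = (now - start_time) // interval + 1
            db.execute(
                "UPDATE jobs SET start_time = ? WHERE id = ?",
                (start_time + steps * interval, job_id),
            )
    db.commit()
    return [row[1] for row in rows]
```

A worker could call `claim_due_jobs(db, int(time.time()))` once a minute, which matches the one-minute resolution in the question. With several workers polling the same table you would also need row locking or a "claimed" flag so that no job is picked up twice; that is left out here.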
I recently described a similar pattern here.
Source: https://stackoverflow.com/questions/5840777/web-application-architecture-job-task-queue-needed