Question
I am currently struggling to find a solution for implementing a specific kind of queue with the following requirements:
- Every queue must respect the order in which jobs were added.
- Each queue has a concurrency of 1, meaning only one job from that queue executes at a time — per queue, not per worker.
- There will be several thousand queues like this.
- It needs to be distributed and able to scale (for example, when I add a worker).
Basically, each one is a single-process FIFO queue, which is exactly what I get when trying out message queue software such as ActiveMQ or RabbitMQ. But as soon as I scale to two workers, it no longer works, because I want it to scale while keeping exactly the same behavior as a single-process queue. Below is a description of how it should work in a distributed environment with multiple workers.
Example of what the topology looks like (note that it is a many-to-many relationship between queues and workers):
Example of how it would run:
+------+-----------------+-----------------+-----------------+
| Step | Worker 1 | Worker 2 | Worker 3 |
+------+-----------------+-----------------+-----------------+
| 1 | Fetch Q/1/Job/1 | Fetch Q/2/Job/1 | Waiting |
+------+-----------------+-----------------+-----------------+
| 2 | Running | Running | Waiting |
+------+-----------------+-----------------+-----------------+
| 3 | Running | Done Q/2/Job/1 | Fetch Q/2/Job/2 |
+------+-----------------+-----------------+-----------------+
| 4 | Done Q/1/Job/1 | Fetch Q/1/Job/2 | Running |
+------+-----------------+-----------------+-----------------+
| 5 | Waiting | Running | Running |
+------+-----------------+-----------------+-----------------+
This is probably not the best representation, but it shows that even though Queue 1 and Queue 2 both have more jobs, Worker 3 does not fetch a queue's next job until that queue's previous job has finished.
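To make the rule concrete, here is a minimal single-threaded sketch of the scheduling behavior I want (queue and job names mirror the table above; this is purely illustrative, not a distributed implementation):

```python
from collections import deque

# Two queues, each with jobs that must run in order, one at a time per queue.
queues = {"Q/1": deque(["Job/1", "Job/2"]), "Q/2": deque(["Job/1", "Job/2"])}
busy = set()  # queues that currently have a job running on some worker

def fetch():
    """A free worker may only take the head job of a queue that is not busy."""
    for qid, q in queues.items():
        if q and qid not in busy:
            busy.add(qid)
            return qid, q.popleft()
    return None  # every non-empty queue is busy: the worker must wait

def done(qid):
    """Finishing a job frees its queue, so its next job becomes fetchable."""
    busy.discard(qid)

a = fetch()             # Worker 1 takes Q/1/Job/1
b = fetch()             # Worker 2 takes Q/2/Job/1
assert fetch() is None  # Worker 3 waits, even though both queues have more jobs
done("Q/2")             # Q/2/Job/1 finishes
c = fetch()             # only now may Worker 3 take Q/2/Job/2
```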
This is what I am struggling to find a good solution for.
I have tried several other solutions such as RabbitMQ, ActiveMQ, and Apollo. They all let me create thousands of queues, but in every one I tried, Worker 3 would pick up the next job in a queue while the previous one was still running; the concurrency setting is per worker, not per queue.
Is there any solution that makes this possible on any MQ platform, e.g. ActiveMQ, RabbitMQ, ZeroMQ, etc.?
Thank you :)
Answer 1:
You can achieve this using Redis lists with an additional "dispatch" queue that all workers BRPOP from for their jobs. Each job in the dispatch queue is tagged with its original queue ID, and when a worker has completed a job, it goes to that original queue and performs an RPOPLPUSH onto the dispatch queue to make the next job available to any worker. The dispatch queue will therefore hold at most num_queues elements, one per source queue.
One thing you'll have to handle is the initial population of the dispatch queue when the source queue is empty. This could just be a check done by the publisher against an "empty" flag for each queue, set initially and also set by the worker when there is nothing left in the original queue to dispatch. If this flag is set, the publisher can simply LPUSH the first job directly onto the dispatch queue.
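As a rough sketch of this pattern, here is an in-memory simulation in Python. Deques stand in for the Redis lists so the logic can be shown self-contained; the list and flag names are invented for illustration, and real code would issue BRPOP, RPOPLPUSH, and LPUSH through a Redis client instead:

```python
from collections import deque, defaultdict

lists = defaultdict(deque)         # lists["dispatch"] plus one list per queue
empty = defaultdict(lambda: True)  # the per-queue "empty" flag

def publish(qid, payload):
    """Publisher side: push a job, tagged with its source queue ID."""
    job = (qid, payload)
    if empty[qid]:
        # Nothing in flight for this queue: LPUSH straight onto dispatch.
        lists["dispatch"].appendleft(job)
        empty[qid] = False
    else:
        lists[qid].appendleft(job)

def work_one(results):
    """One worker iteration: BRPOP from dispatch, run the job, then
    RPOPLPUSH the source queue's next job onto dispatch (if any)."""
    if not lists["dispatch"]:
        return False
    qid, payload = lists["dispatch"].pop()  # BRPOP
    results.append((qid, payload))          # "run" the job
    if lists[qid]:
        lists["dispatch"].appendleft(lists[qid].pop())  # RPOPLPUSH
    else:
        empty[qid] = True                   # nothing left to dispatch
    return True
```

Note the invariant this maintains: the dispatch list holds at most one job per source queue, so no two workers can ever run jobs from the same queue concurrently, while jobs from different queues interleave freely.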
Source: https://stackoverflow.com/questions/41979438/how-can-i-implement-this-single-concurrency-distributed-queue-in-any-mq-platform