How to implement distributed processing [closed]

Question

I have requests coming in for different samples (s1, s2, ...) that need to be processed in a linear fashion (i.e. only one s1 request can be processed at a time). I have N worker services that can process the requests. How can I implement an RPC-queue pattern so that the requests for each sample are consumed one at a time, while still allowing the calculation to be distributed between different samples?

I would like to implement this with RabbitMQ because of its simplicity and clustering capabilities, but I'm willing to consider other solutions as well.

Here is a picture to illustrate the problem (with two workers):

                               worker 1 
                            +-----------+
                            |           |
 input queue          +---->|           |-------+
+--------------+      |     |           |       |
|              |      |     +-----------+       |
| s1,s2,s1,s1  |------+                         |
|              |      |        worker 2         |
+--------------+      |     +-----------+       |
                      |     |           |       |
 output queue         +---->|           |-------+
+--------------+            |           |       |
|              |            +-----------+       |
|(s1,s2,s1,s1) |<-+                             |
|              |  +-----------------------------+
+--------------+

Answer 1:

This is a trivial task-queue setup, assuming the tasks do not interfere with one another.

The ZeroMQ guide has good discussions of this and of somewhat more complex setups.

Check the formal behaviour model for Divide and Conquer (figures courtesy of ZeroMQ/iMatix) at http://zguide.zeromq.org/page:all#Divide-and-Conquer

(Just for inspiration, also check an extended approach with a SIG_KILL add-on.)

N.B.: I have no affiliation with ZeroMQ or with iMatix. However, after many projects that work well thanks in part to the ZeroMQ abstraction and architecture, IMHO it is something one can only benefit from in high-performance, scalable, low-latency distributed systems.
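One caveat for the per-sample ordering asked about in the question: a plain task-queue fan-out lets any free worker take the next request, so two s1 requests could run at the same time. A possible way around that is to route every request by its sample id to one fixed worker inbox, so each sample is handled strictly in order while different samples run in parallel. Below is a minimal, broker-free sketch of that idea using only the Python standard library. The names `route`, `worker` and `run` are made up for illustration, and the in-process queues merely stand in for real broker queues (with RabbitMQ, the consistent hash exchange plugin may give similar routing, but that is not shown here).

```python
import queue
import threading
import zlib


def route(sample_id: str, n_workers: int) -> int:
    """Map a sample id to a fixed worker index (hypothetical helper).

    crc32 is used because it is stable across processes and runs.
    """
    return zlib.crc32(sample_id.encode("utf-8")) % n_workers


def worker(name: str, inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Consume requests one at a time; a None sentinel stops the worker."""
    while True:
        item = inbox.get()
        if item is None:
            break
        sample_id, payload = item
        # Placeholder for the real calculation on this request.
        outbox.put((name, sample_id, payload))


def run(requests, n_workers: int = 2):
    """Dispatch (sample_id, payload) pairs and collect the results."""
    inboxes = [queue.Queue() for _ in range(n_workers)]
    outbox = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(f"worker-{i}", inboxes[i], outbox))
        for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    # Same sample id -> same inbox, so its requests stay in FIFO order.
    for sample_id, payload in requests:
        inboxes[route(sample_id, n_workers)].put((sample_id, payload))
    for inbox in inboxes:
        inbox.put(None)
    for t in threads:
        t.join()
    results = []
    while not outbox.empty():
        results.append(outbox.get())
    return results


if __name__ == "__main__":
    for result in run([("s1", 0), ("s2", 1), ("s1", 2), ("s1", 3)]):
        print(result)
```

The trade-off of this design is that a fixed mapping can leave some workers idle when a few samples dominate the traffic; the ordering guarantee is bought with less flexible load balancing.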

Answer 2:

Hey, have you checked out https://storm.incubator.apache.org? It is written mostly in Clojure and Java, I believe, though it can also run components written in other languages such as Python.

Iron.io can host your queues and run distributed worker patterns on our platform in any language. IronWorker is also backed by a task queue, which makes life pretty easy for you.

Hotel Tonight used the term ETL (extract, transform, load) for passing and transforming data through a pipeline:

http://engineering.hoteltonight.com/ruby-etl-with-ironworker-and-redshift

(I work for Iron.io; I just wanted to put some resources out there.)

Source: https://stackoverflow.com/questions/20838549/how-to-implement-distributed-processing

Tags

RabbitMQ

message-queue

zeromq

distributed-computing