I'm looking to run a long-running Python analysis process on a few Amazon EC2 instances. The code already runs using the Python multiprocessing
module and can
The multiprocessing docs describe a good setup for running across multiple machines. S3 is a good way to share files across EC2 instances, but with multiprocessing you can also share queues and pass data directly between processes.
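As a rough illustration of sharing a queue between machines, here is a minimal sketch built on `multiprocessing.managers.BaseManager`. The names `QueueManager`, `get_tasks`, `serve_in_background`, `round_trip` and the `AUTHKEY` value are made up for this example. To keep it runnable on one machine, the server runs in a background thread on localhost; on EC2 you would presumably run the server part on one instance and connect from the others using that instance's address and a fixed port.

```python
import queue
import threading
from multiprocessing.managers import BaseManager

# Hypothetical placeholder: use a real secret shared by all your instances.
AUTHKEY = b"change-me"

_tasks = queue.Queue()  # the actual queue lives in the serving process


def _get_tasks():
    return _tasks


class QueueManager(BaseManager):
    """Exposes the task queue to other processes over the network."""


QueueManager.register("get_tasks", callable=_get_tasks)


def serve_in_background(host="127.0.0.1", port=0):
    """Start serving the queue; return the (host, port) actually bound."""
    manager = QueueManager(address=(host, port), authkey=AUTHKEY)
    server = manager.get_server()
    # A daemon thread is used only so this sketch fits in a single process.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server.address


def round_trip(items, address):
    """Connect as a client, push items through the queue, read them back."""
    client = QueueManager(address=address, authkey=AUTHKEY)
    client.connect()
    tasks = client.get_tasks()  # proxy to the queue held by the server
    for item in items:
        tasks.put(item)
    return [tasks.get(timeout=5) for _ in items]
```

A worker on another machine would do the same as `round_trip` but only call `get` (or `put` results on a second registered queue). Items must be picklable, since they travel over a socket.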
If you can use Hadoop for your parallel tasks, it is a very good way to extract parallelism across machines, but if you need a lot of IPC, building your own solution with multiprocessing isn't that bad.
Just make sure you put your machines in the same security group :-)