How do I parallelize a simple Python loop?

北荒 2020-11-22 11:54

This is probably a trivial question, but how do I parallelize the following loop in Python?

# setup output lists
output1 = list()
output2 = list()
output3 = list()


        
13 answers
  •  南笙 (original poster)
    2020-11-22 12:33

    There are a number of advantages to using Ray:

    • You can parallelize over multiple machines in addition to multiple cores (with the same code).
    • Efficient handling of numerical data through shared memory (and zero-copy serialization).
    • High task throughput with distributed scheduling.
    • Fault tolerance.

    In your case, you could start Ray and define a remote function

    import ray
    
    ray.init()
    
    # Current Ray releases name this option num_returns (formerly num_return_vals).
    @ray.remote(num_returns=3)
    def calc_stuff(parameter=None):
        # Do something.
        return 1, 2, 3
    

    and then invoke it in parallel

    output1, output2, output3 = [], [], []
    
    # Launch the tasks.
    for j in range(10):
        id1, id2, id3 = calc_stuff.remote(parameter=j)
        output1.append(id1)
        output2.append(id2)
        output3.append(id3)
    
    # Block until the tasks have finished, then fetch the results.
    output1 = ray.get(output1)
    output2 = ray.get(output2)
    output3 = ray.get(output3)
    

    To run the same example on a cluster, only the call to ray.init() would need to change. See the Ray documentation for details.
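
    For comparison, the launch-then-gather pattern above can be sketched with only the standard library's concurrent.futures. This is a stand-in rather than Ray's API: it runs on a single machine and offers none of the Ray features listed earlier. The body of calc_stuff and the helper name run_all are hypothetical, chosen only for illustration.

```python
from concurrent.futures import Executor


def calc_stuff(parameter=None):
    # Hypothetical stand-in for the real work; returns three values per call.
    return parameter, parameter * 2, parameter * 3


def run_all(parameters, executor: Executor):
    # Launch one task per parameter; submit() returns a Future immediately.
    futures = [executor.submit(calc_stuff, parameter=p) for p in parameters]

    # Block until every task has finished, keeping results in launch order.
    results = [future.result() for future in futures]
    if not results:
        return [], [], []

    # Transpose the list of 3-tuples into three lists, one per output.
    output1, output2, output3 = (list(column) for column in zip(*results))
    return output1, output2, output3
```

    To use separate processes, pass in a ProcessPoolExecutor created under an if __name__ == "__main__": guard. A ThreadPoolExecutor also works with run_all, but threads generally will not speed up CPU-bound pure-Python work because of the GIL.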

    Note that I'm helping to develop Ray.
