How to parallelize this piece of code?

帅比萌擦擦* submitted on 2020-01-07 05:58:09

Question


I've been browsing for some time but couldn't find any constructive answer that I could comprehend.

How should I parallelize the following code:

import random
import math
import numpy as np
import sys
import multiprocessing

boot = 20  # number of iterations to be performed
def myscript(iteration_number):
    # stuff that the code actually does
    pass


def main(unused_command_line_args):
    for i in range(boot):
        myscript(i)
    return 0

if __name__ == '__main__':
    sys.exit(main(sys.argv))

Or where can I read about it? I'm not even sure how to search for it.


Answer 1:


There's a fairly natural progression from a for loop to a parallel map for a batch of embarrassingly parallel jobs, that is, jobs that don't depend on one another.

>>> import multiprocess as mp
>>> # build a target function
>>> def doit(x):
...   return x**2 - 1
... 
>>> x = range(10)
>>> # the for loop
>>> y = []   
>>> for i in x:
...   y.append(doit(i))
... 
>>> y
[-1, 0, 3, 8, 15, 24, 35, 48, 63, 80]

So how do you run this function in parallel?

>>> # convert the for loop to a map (still serial)
>>> y = list(map(doit, x))
>>> y
[-1, 0, 3, 8, 15, 24, 35, 48, 63, 80]
>>> 
>>> # build a worker pool for parallel tasks
>>> p = mp.Pool()
>>> # do blocking parallel
>>> y = p.map(doit, x)
>>> y
[-1, 0, 3, 8, 15, 24, 35, 48, 63, 80]
>>> 
>>> # use an iterator (non-blocking)
>>> y = p.imap(doit, x)
>>> y            
<multiprocess.pool.IMapIterator object at 0x10358d150>
>>> print(list(y))
[-1, 0, 3, 8, 15, 24, 35, 48, 63, 80]
>>> # do asynchronous parallel
>>> y = p.map_async(doit, x)
>>> y
<multiprocess.pool.MapResult object at 0x10358d1d0>
>>> print(y.get())
[-1, 0, 3, 8, 15, 24, 35, 48, 63, 80]
>>>
>>> # or if you like for loops, there's always this…
>>> y = p.imap_unordered(doit, x)
>>> z = []
>>> for i in iter(y):
...   z.append(i)
... 
>>> z
[-1, 0, 3, 8, 15, 24, 35, 48, 63, 80]

The last form is an unordered iterator, which tends to be the fastest, but only use it if you don't care about the order of the results: they are not guaranteed to come back in the order they were submitted.
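If the order does matter but you still want to consume results as they finish, one common pattern is to have each task carry its own index and sort at the end. This is only a minimal sketch: it assumes the standard-library multiprocessing, and doit_indexed and restore_order are hypothetical helpers that are not part of the answer above.

```python
import multiprocessing as mp


def doit(x):
    # same target function as in the answer above
    return x**2 - 1


def doit_indexed(pair):
    # hypothetical helper: carry the submission index along with the result
    index, value = pair
    return index, doit(value)


def restore_order(indexed_results):
    # sort by the submission index, then drop the index
    return [result for _, result in sorted(indexed_results)]


if __name__ == "__main__":
    with mp.Pool() as pool:
        # results arrive in completion order, each tagged with its index
        tagged = list(pool.imap_unordered(doit_indexed, enumerate(range(10))))
    print(restore_order(tagged))
```

Sorting costs a little extra at the end, so this only pays off when you want to do something with each result as soon as it arrives; otherwise p.map already returns results in submission order.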

Note also that I've used multiprocess (a fork of multiprocessing) instead of multiprocessing, purely because multiprocess works better with interactively defined functions. Otherwise the code above is the same for multiprocessing.
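Applied to the script in the question, the same idea might look roughly like the following. It is a sketch using the standard-library multiprocessing in a script file rather than at the interactive prompt; the body of myscript is a placeholder assumption, since the question does not show what it actually does, and main here simply prints and returns the collected results.

```python
import multiprocessing

boot = 20  # number of iterations to be performed


def myscript(iteration_number):
    # placeholder body (assumption): the real work is not shown in the question
    return iteration_number**2 - 1


def main():
    # Pool() defaults to as many worker processes as the machine reports CPUs
    with multiprocessing.Pool() as pool:
        # blocking parallel map: replaces the serial for loop over range(boot)
        results = pool.map(myscript, range(boot))
    print(results)
    return results


if __name__ == "__main__":
    main()
```

The if __name__ == "__main__" guard matters here: on platforms where worker processes are started by spawning a fresh interpreter (Windows, for example), the module is re-imported in each worker, and an unguarded Pool() call would fail.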



Source: https://stackoverflow.com/questions/31977702/how-to-parallelize-this-piece-of-code
