Proper way to use multiprocessing.Pool in a nested loop

礼貌的吻别 2020-12-05 10:40

I am using multiprocessing.Pool() to speed up an "embarrassingly parallel" loop. I actually have a nested loop, and am using multiprocessing.Pool to speed up the ...

3 Answers
  • 2020-12-05 11:09

    Ideally, you should call the Pool() constructor exactly once, not over and over again. Creating worker processes carries substantial overhead, and you pay that cost every time you invoke Pool(). The processes created by a single Pool() call persist: when they finish the work you've given them in one part of the program, they stay alive, waiting for more work to do.
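
For the nested-loop case in the question, a minimal sketch of "create the pool once" could look like the following. The names cell_value and run_nested are hypothetical stand-ins, not from the original question; the only point is that the pool is built outside both loops and handed to the code that needs it.

```python
import multiprocessing as mp


def cell_value(i, j):
    # Hypothetical stand-in for the real per-iteration work.
    return i * j


def run_nested(pool, n_outer, n_inner):
    """Run the inner loop in parallel for each outer iteration, reusing one pool."""
    rows = []
    for i in range(n_outer):
        # starmap unpacks each (i, j) tuple into cell_value(i, j).
        rows.append(pool.starmap(cell_value, [(i, j) for j in range(n_inner)]))
    return rows


if __name__ == "__main__":
    # The pool is created once, outside both loops, and reused for every row.
    with mp.Pool(processes=4) as pool:
        print(run_nested(pool, 3, 4))
```

Passing the pool in as an argument keeps the expensive Pool() call in one place, so no inner function is tempted to build its own.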

    As for Pool.close(), call it when, and only when, you are never going to submit more work to that Pool instance. So Pool.close() is typically called once the parallelizable part of your main program is finished. The worker processes then terminate after all work already assigned has completed.

    It's also excellent practice to call Pool.join() to wait for the worker processes to terminate; it must come after Pool.close() (or Pool.terminate()). Among other reasons, there's often no good way to report exceptions in parallelized code (exceptions occur in a context only vaguely related to what your main program is doing), and Pool.join() provides a synchronization point that can report some exceptions that occurred in worker processes that you'd otherwise never see.
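
As a rough sketch of the close()/join() sequence, the example below submits work with apply_async, shuts the pool down, and then asks each result handle for its value. Calling get() on a handle re-raises an exception that occurred in the worker, which is one dependable way to see such errors. The names risky_reciprocal and run_and_shut_down are made up for illustration.

```python
import multiprocessing as mp


def risky_reciprocal(x):
    # Raises ZeroDivisionError inside the worker when x == 0.
    return 1.0 / x


def run_and_shut_down(pool, inputs):
    """Submit every input, shut the pool down, then collect results and errors."""
    pending = [(x, pool.apply_async(risky_reciprocal, (x,))) for x in inputs]
    pool.close()  # no more work will be submitted to this pool
    pool.join()   # wait for the workers to finish and exit
    results, errors = {}, {}
    for x, handle in pending:
        try:
            results[x] = handle.get()  # re-raises the worker's exception here
        except ZeroDivisionError as exc:
            errors[x] = exc
    return results, errors


if __name__ == "__main__":
    results, errors = run_and_shut_down(mp.Pool(processes=2), [1, 2, 0, 4])
    print(results)
    print(errors)
```

Note that this pool cannot be reused after the call: close() is final, which is why it belongs at the very end of the parallel part of the program.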

    Have fun :-)

  • 2020-12-05 11:14
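    import itertools
    import multiprocessing as mp

    def job(params):
        # Each task is one (a, b) pair from the flattened nested loop.
        a, b = params
        return a * b

    def multicore():
        a = range(1000)
        b = range(2000)
        # Flatten the nested loop into a single list of (a, b) pairs.
        paramlist = list(itertools.product(a, b))
        print(paramlist[0])
        # Create the pool once and hand it all the work in a single map() call.
        pool = mp.Pool(processes=4)
        res = pool.map(job, paramlist)
        # No more work will be submitted, so shut the pool down cleanly.
        pool.close()
        pool.join()
        for i in res:
            print(i)

    if __name__ == '__main__':
        multicore()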
    

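    How about this? The nested loop is flattened with itertools.product, so a single pool.map() call covers every (a, b) pair.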
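
A possible variation on the snippet above, offered only as a sketch: pool.imap() accepts the itertools.product iterator directly and returns results in input order as they become available, and an explicit chunksize batches the many tiny tasks sent to each worker. The helper name stream_products and the chunksize of 1000 are arbitrary choices, not something from the answer.

```python
import itertools
import multiprocessing as mp


def job(params):
    a, b = params
    return a * b


def stream_products(pool, n_a, n_b, chunksize=1000):
    """Return an iterator over job((a, b)) for every pair, in input order."""
    pairs = itertools.product(range(n_a), range(n_b))
    # chunksize groups many small tasks into each message sent to a worker.
    return pool.imap(job, pairs, chunksize=chunksize)


if __name__ == "__main__":
    with mp.Pool(processes=4) as pool:
        # Consume the iterator inside the with-block, while the pool is alive.
        total = sum(stream_products(pool, 1000, 2000))
    print(total)
```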

  • 2020-12-05 11:18
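    import time
    # pathos is a third-party package; it is not part of the standard library.
    from pathos.parallel import stats
    from pathos.parallel import ParallelPool as Pool


    def work(x, y):
        return x * y


    pool = Pool(5)
    pool.ncpus = 4  # appears to override the worker count of 5 given above
    pool.servers = ('localhost:5654',)  # presumably needs a server listening at this address
    t1 = time.time()
    # imap takes one iterable per argument of work() and pairs them element-wise.
    results = pool.imap(work, range(1, 2), range(1, 11))
    print("INFO: List is: %s" % list(results))
    print(stats())
    t2 = time.time()
    print("TIMER: Function completed time is: %.5f" % (t2 - t1))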
    