Efficient implementation of Python multiprocessing Pool

谁都会走 submitted on 2020-05-17 06:22:07

Question


I have two versions of the same code; one is a pooled (multiprocessing) version of the other. The serial version finishes in ~15 sec, but the parallel version takes a very long time even with a single worker process. Can someone help me speed up the second version?

  1. Serial
    import numpy as np
    import time

    def mapTo(d):
        global tree
        for idx, item in enumerate(list(d), start=1):
            tree[str(item)].append(idx)

    data = np.random.randint(1, 4, 20000000)
    tree = {"1": [], "2": [], "3": []}
    s = time.perf_counter()
    mapTo(data)
    e = time.perf_counter()
    print("elapsed time:", e - s)

Takes ~15 sec.

  2. Parallel
    from multiprocessing import Manager, Pool
    from functools import partial
    import numpy as np
    import time

    def mapTo(i_d, tree):
        idx, item = i_d
        # Fetch the list from the shared dict, append, and write it back
        l = tree[str(item)]
        l.append(idx)
        tree[str(item)] = l

    if __name__ == "__main__":  # needed on platforms that spawn worker processes
        manager = Manager()
        data = np.random.randint(1, 4, 20000000)
        # sharedtree= manager.dict({"1":manager.list(),"2":manager.list(),"3":manager.list()})
        sharedtree = manager.dict({"1": [], "2": [], "3": []})
        s = time.perf_counter()
        with Pool(processes=1) as pool:
            pool.map(partial(mapTo, tree=sharedtree), list(enumerate(data, start=1)))
        e = time.perf_counter()
        print("elapsed time:", e - s)
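A likely reason for the slowdown is that every call to `mapTo` goes through the `Manager` proxy several times: reading `tree[str(item)]` fetches a pickled copy of the whole list from the manager process, and assigning it back sends the whole list again, so the cost per item grows with the list length. One possible way around this, sketched here under assumptions, is to give each worker a large contiguous chunk, let it build an ordinary local dict, and merge the partial results in the parent. The helper names `map_chunk`, `split`, and `merge` are hypothetical (they are not in the original code), and the demo uses a small plain list of random integers instead of the 20,000,000-element NumPy array; a NumPy array should work the same way, since the helpers only rely on `len`, slicing, and `str(item)`.

```python
import random
from multiprocessing import Pool


def map_chunk(args):
    """Index one chunk locally; `start` is the 1-based position of its first item."""
    start, chunk = args
    local = {"1": [], "2": [], "3": []}
    for idx, item in enumerate(chunk, start=start):
        local[str(item)].append(idx)
    return local


def split(data, n_chunks):
    """Yield (start, chunk) pairs that cover `data` in order."""
    size = max(1, -(-len(data) // n_chunks))  # ceiling division, at least 1
    for lo in range(0, len(data), size):
        yield lo + 1, data[lo:lo + size]


def merge(partials):
    """Concatenate per-chunk results; processing chunks in order keeps each list sorted."""
    tree = {"1": [], "2": [], "3": []}
    for local in partials:
        for key, idxs in local.items():
            tree[key].extend(idxs)
    return tree


if __name__ == "__main__":
    # Hypothetical demo data: values 1..3, as in the question, but much smaller.
    data = [random.randint(1, 3) for _ in range(100_000)]
    with Pool(processes=4) as pool:
        # One task per chunk, so only a handful of messages cross process boundaries.
        tree = merge(pool.map(map_chunk, split(data, 4)))
    print({key: len(idxs) for key, idxs in tree.items()})
```

Because `pool.map` returns results in input order, the merged lists come out in the same order as in the serial version. Whether this actually beats the serial loop depends on the cost of pickling the chunks and results, so it is worth timing both on the real data.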

Source: https://stackoverflow.com/questions/61593959/efficient-implmentation-of-python-multiprocesssing-pool
