Question
I need to do some work in multiple processes with Python 3.6. Specifically, I have to update a dict by adding lists of objects to it. Since these objects are unpicklable, I need to use dill instead of pickle, and multiprocess from pathos instead of multiprocessing, but this should not be the problem.
Adding a list to the dictionary requires reserializing the list before it is added. This slows everything down, so it takes the same time as without multiprocessing. Could you suggest a workaround?
This is my code with Python 3.6: init1 works but is slow, init2 is fast but broken. The rest is only for testing purposes.
import time


def init1(d: dict):
    for i in range(1000):
        l = []
        for k in range(i):
            l.append(k)
        # The finished list is serialized and sent to the manager here.
        d[i] = l


def init2(d: dict):
    for i in range(1000):
        l = []
        # Only the empty list is sent to the manager here.
        d[i] = l
        for k in range(i):
            # These appends change the local list, not the stored copy.
            l.append(i)


def test1():
    import multiprocess as mp
    with mp.Manager() as manager:
        d = manager.dict()
        p = mp.Process(target=init1, args=(d,))
        p.start()
        p.join()
        print(d)


def test2():
    import multiprocess as mp
    with mp.Manager() as manager:
        d = manager.dict()
        p = mp.Process(target=init2, args=(d,))
        p.start()
        p.join()
        print(d)


start = time.time()
test1()
end = time.time()
print('test1: ', end - start)
start = time.time()
test2()
end = time.time()
print('test2: ', end - start)
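A minimal sketch of why init2 is fast but broken, under two assumptions that are not part of the code above: it uses the standard-library multiprocessing as a stand-in for multiprocess, and it forces the "fork" start method (Unix-like systems only) to keep the example short. The helper names mutate_after_assign, assign_after_build and run are hypothetical. A manager dict stores a serialized copy of whatever is assigned to it, so appending to the local list afterwards never reaches the stored value.

```python
import multiprocessing as mp  # assumption: stdlib stand-in for `multiprocess`


def mutate_after_assign(d):
    # Same order of operations as init2.
    l = []
    d[0] = l      # the manager receives a copy of the empty list
    l.append(1)   # only the local list changes


def assign_after_build(d):
    # Same order of operations as init1.
    l = [1]
    d[0] = l      # the finished list is serialized once and stored


def run(target):
    # Assumption: "fork" is available (Unix-like systems only).
    ctx = mp.get_context("fork")
    with ctx.Manager() as manager:
        d = manager.dict()
        p = ctx.Process(target=target, args=(d,))
        p.start()
        p.join()
        # copy() returns a plain dict that outlives the manager.
        return d.copy()
```

Calling run(mutate_after_assign) should give a dict whose list is still empty, while run(assign_after_build) should give the filled list. That is also why init2 is so quick: it only ever ships empty lists.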
Answer 1:
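A possible solution is to use pipes. The child process builds an ordinary dict locally and sends it through the pipe once, so the data is serialized a single time instead of on every assignment. On my PC this takes 870 ms, compared to 1.10 s for test1 and 200 ms for test2.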
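def init3(child_conn):
    # Build a plain dict locally; nothing is serialized inside the loop.
    d = {}
    for i in range(1000):
        l = []
        for k in range(i):
            l.append(i)
        d[i] = l
    # Serialize the whole dict once and send it to the parent.
    child_conn.send(d)


def test3():
    import multiprocess as mp
    parent_conn, child_conn = mp.Pipe(duplex=False)
    p = mp.Process(target=init3, args=(child_conn,))
    p.start()
    # Receive before joining, so the child is not left blocked in send().
    d = parent_conn.recv()
    p.join()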
In Jupyter, using the %timeit magic, I get:
In [01]: %timeit test3()
872 ms ± 11.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [02]: %timeit test2()
199 ms ± 1.72 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [03]: %timeit test1()
1.09 s ± 10.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
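The pipe pattern can also be sketched so that the result is returned to the caller and can be checked. This is only an illustration under stated assumptions: it uses the standard-library multiprocessing instead of multiprocess, forces the "fork" start method (Unix-like systems only), and the names build_dict and fetch_dict are hypothetical. With unpicklable objects, multiprocess would still be needed in place of multiprocessing.

```python
import multiprocessing as mp  # assumption: stdlib stand-in for `multiprocess`


def build_dict(conn, n):
    # Same contents as init3: key i maps to a list of i copies of i.
    d = {i: [i] * i for i in range(n)}
    conn.send(d)   # one serialization of the whole dict
    conn.close()


def fetch_dict(n=1000):
    # Assumption: "fork" is available (Unix-like systems only).
    ctx = mp.get_context("fork")
    parent_conn, child_conn = ctx.Pipe(duplex=False)
    p = ctx.Process(target=build_dict, args=(child_conn, n))
    p.start()
    child_conn.close()      # the parent does not use the sending end
    d = parent_conn.recv()  # receive first so the child can finish sending
    p.join()
    return d
```

Receiving before joining matters for large payloads: if the parent joined first, the child could block in send() while waiting for the pipe to be read.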
Source: https://stackoverflow.com/questions/48720252/python-multiprocess-dict-of-list