In the Python multiprocessing
library, is there a variant of pool.map
which supports multiple arguments?
text = "test"
def
Using Python 3.3+ with pool.starmap():
from multiprocessing.dummy import Pool as ThreadPool
def write(i, x):
print(i, "---", x)
a = ["1","2","3"]
b = ["4","5","6"]
pool = ThreadPool(2)
pool.starmap(write, zip(a,b))
pool.close()
pool.join()
Result:
1 --- 4
2 --- 5
3 --- 6
You can also zip() more arguments if you like: zip(a,b,c,d,e)
In case you want to have a constant value passed as an argument you have to use import itertools
and then zip(itertools.repeat(constant), a)
for example.
How to take multiple arguments:
def f1(args):
a, b, c = args[0] , args[1] , args[2]
return a+b+c
if __name__ == "__main__":
import multiprocessing
pool = multiprocessing.Pool(4)
result1 = pool.map(f1, [ [1,2,3] ])
print(result1)
Another simple alternative is to wrap your function parameters in a tuple and then wrap the parameters that should be passed in tuples as well. This is perhaps not ideal when dealing with large pieces of data. I believe it would make copies for each tuple.
from multiprocessing import Pool
def f((a,b,c,d)):
print a,b,c,d
return a + b + c +d
if __name__ == '__main__':
p = Pool(10)
data = [(i+0,i+1,i+2,i+3) for i in xrange(10)]
print(p.map(f, data))
p.close()
p.join()
Gives the output in some random order:
0 1 2 3
1 2 3 4
2 3 4 5
3 4 5 6
4 5 6 7
5 6 7 8
7 8 9 10
6 7 8 9
8 9 10 11
9 10 11 12
[6, 10, 14, 18, 22, 26, 30, 34, 38, 42]
From python 3.4.4, you can use multiprocessing.get_context() to obtain a context object to use multiple start methods:
import multiprocessing as mp
def foo(q, h, w):
q.put(h + ' ' + w)
print(h + ' ' + w)
if __name__ == '__main__':
ctx = mp.get_context('spawn')
q = ctx.Queue()
p = ctx.Process(target=foo, args=(q,'hello', 'world'))
p.start()
print(q.get())
p.join()
Or you just simply replace
pool.map(harvester(text,case),case, 1)
by:
pool.apply_async(harvester(text,case),case, 1)
A better solution for python2:
from multiprocessing import Pool
def func((i, (a, b))):
print i, a, b
return a + b
pool = Pool(3)
pool.map(func, [(0,(1,2)), (1,(2,3)), (2,(3, 4))])
2 3 4
1 2 3
0 1 2
out[]:
[3, 5, 7]
You can use the following two functions so as to avoid writing a wrapper for each new function:
import itertools
from multiprocessing import Pool
def universal_worker(input_pair):
function, args = input_pair
return function(*args)
def pool_args(function, *args):
return zip(itertools.repeat(function), zip(*args))
Use the function function
with the lists of arguments arg_0
, arg_1
and arg_2
as follows:
pool = Pool(n_core)
list_model = pool.map(universal_worker, pool_args(function, arg_0, arg_1, arg_2)
pool.close()
pool.join()