I\'m trying to use multiprocessing
\'s Pool.map()
function to divide out work simultaneously. When I use the following code, it works fine:
All of these solutions are ugly because multiprocessing and pickling is broken and limited unless you jump outside the standard library.
If you use a fork of multiprocessing
called pathos.multiprocesssing
, you can directly use classes and class methods in multiprocessing's map
functions. This is because dill
is used instead of pickle
or cPickle
, and dill
can serialize almost anything in python.
pathos.multiprocessing
also provides an asynchronous map function… and it can map
functions with multiple arguments (e.g. map(math.pow, [1,2,3], [4,5,6])
)
See: What can multiprocessing and dill do together?
and: http://matthewrocklin.com/blog/work/2013/12/05/Parallelism-and-Serialization/
>>> import pathos.pools as pp
>>> p = pp.ProcessPool(4)
>>>
>>> def add(x,y):
... return x+y
...
>>> x = [0,1,2,3]
>>> y = [4,5,6,7]
>>>
>>> p.map(add, x, y)
[4, 6, 8, 10]
>>>
>>> class Test(object):
... def plus(self, x, y):
... return x+y
...
>>> t = Test()
>>>
>>> p.map(Test.plus, [t]*4, x, y)
[4, 6, 8, 10]
>>>
>>> p.map(t.plus, x, y)
[4, 6, 8, 10]
And just to be explicit, you can do exactly want you wanted to do in the first place, and you can do it from the interpreter, if you wanted to.
>>> import pathos.pools as pp
>>> class someClass(object):
... def __init__(self):
... pass
... def f(self, x):
... return x*x
... def go(self):
... pool = pp.ProcessPool(4)
... print pool.map(self.f, range(10))
...
>>> sc = someClass()
>>> sc.go()
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
>>>
Get the code here: https://github.com/uqfoundation/pathos