I'm trying to use multiprocessing's Pool.map() function to divide out work simultaneously.
There's another short-cut you can use, although it can be inefficient depending on what's in your class instances.
As everyone has said, the problem is that multiprocessing has to pickle the things that it sends to the sub-processes it has started, and the pickler doesn't do instance-methods (Python 2's, anyway; Python 3's pickle can handle bound methods).
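To see the underlying failure in isolation, here is a minimal sketch (the Foo class is illustrative, not from the question):

import pickle

class Foo(object):
    def bar(self):
        return 42

# On Python 2 this fails with a pickling error (can't pickle instancemethod),
# which is exactly what Pool trips over; on Python 3 it succeeds, since
# bound methods are pickled by instance plus method name.
pickle.dumps(Foo().bar)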
However, instead of sending the instance-method, you can send the actual class instance, plus the name of the function to call, to an ordinary function that then uses getattr to call the instance-method, thus creating the bound method in the Pool subprocess. This is similar to defining a __call__ method, except that you can call more than one member function.
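As a minimal sketch of the idea, driven through Pool.map this time (the Worker class and its compute method are illustrative, not from the original code; it relies on functools.partial objects being picklable, which is true on Python 3):

import functools
import multiprocessing

def call_it(instance, name, args=(), kwargs=None):
    "indirect caller for instance methods and multiprocessing"
    if kwargs is None:
        kwargs = {}
    return getattr(instance, name)(*args, **kwargs)

class Worker(object):
    def compute(self, x):
        return x * x

if __name__ == '__main__':
    w = Worker()
    with multiprocessing.Pool(2) as pool:
        # partial pins the instance and method name; each element of the
        # iterable becomes the args tuple that call_it unpacks
        results = pool.map(functools.partial(call_it, w, 'compute'),
                           [(i,) for i in range(5)])
    print(results)  # [0, 1, 4, 9, 16]

Each worker re-creates its own copy of w when a task is unpickled, exactly as in the fuller example that follows.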
Stealing @EricH.'s code from his answer and annotating it a bit (I retyped it, hence all the name changes and such; for some reason this seemed easier than cut-and-paste :-) ) for illustration of all the magic:
import multiprocessing
import os

def call_it(instance, name, args=(), kwargs=None):
    "indirect caller for instance methods and multiprocessing"
    if kwargs is None:
        kwargs = {}
    return getattr(instance, name)(*args, **kwargs)

class Klass(object):
    def __init__(self, nobj, workers=multiprocessing.cpu_count()):
        print("Constructor (in pid=%d)..." % os.getpid())
        self.count = 1
        pool = multiprocessing.Pool(processes=workers)
        # each task ships a pickled copy of self plus the method name
        async_results = [pool.apply_async(call_it,
                                          args=(self, 'process_obj', (i,)))
                         for i in range(nobj)]
        pool.close()
        # block until every task has finished, then collect the results
        for r in async_results:
            r.wait()
        lst_results = [r.get() for r in async_results]
        print(lst_results)

    def __del__(self):
        self.count -= 1
        print("... Destructor (in pid=%d) count=%d" % (os.getpid(), self.count))

    def process_obj(self, index):
        print("object %d" % index)
        return "results"

if __name__ == '__main__':
    Klass(nobj=8, workers=3)
The output shows that, indeed, the constructor is called once (in the original pid) and the destructor is called 9 times (once for each copy made, i.e., 2 or 3 times per pool-worker-process as needed, plus once in the original process). This is often OK, as it is here, since the default pickler makes a copy of the entire instance and (semi-)secretly re-populates it. In this case, it effectively does:
obj = object.__new__(Klass)
obj.__dict__.update({'count':1})
That's why, even though the destructor is called eight times in the three worker processes, it counts down from 1 to 0 each time. Of course, you can still get into trouble this way. If necessary, you can provide your own __setstate__; in this case, for instance:

def __setstate__(self, adict):
    self.count = adict['count']
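A common reason to do that is an attribute that can't be pickled at all. Here is a sketch, with an illustrative (not from the original) lock attribute, of a __getstate__/__setstate__ pair that drops such an attribute before pickling and rebuilds it in the receiving process:

import threading

class Locked(object):
    def __init__(self):
        self.count = 1
        self.lock = threading.Lock()  # locks are not picklable

    def __getstate__(self):
        # copy the instance dict and drop the unpicklable attribute
        state = self.__dict__.copy()
        del state['lock']
        return state

    def __setstate__(self, adict):
        self.__dict__.update(adict)
        self.lock = threading.Lock()  # re-create it in the new process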