How can I parallelize method calls on an array of objects?

Question

I have a simulation that consists of a list of objects. I'd like to call a method on all of those objects in parallel using a thread pool, since none of them depends on the others. You can't pickle a method, so I was thinking of using a wrapper function with a side effect, something like the following:

from multiprocessing import Pool

class subcl:
    def __init__(self):
        self.counter=1
        return
    def increment(self):
        self.counter+=1
        return

def wrapper(targ):
    targ.increment()
    return

class sim:
    def __init__(self):
        self.world=[subcl(),subcl(),subcl(),subcl()]
    def run(self):
        if __name__=='__main__':
            p=Pool()
            p.map(wrapper,self.world)

a=sim()
a.run()
print(a.world[1].counter)  # should be 2

However, the function call doesn't have the intended side effect on the actual objects in the array. Is there a way to handle this simply with a thread pool and map, or do I have to do everything in terms of raw function calls and tuples/lists/dicts (or get more elaborate with multiprocessing or some other parallelism library)?

Answer 1:

The main source of confusion is that multiprocessing uses separate processes, not threads. Each worker operates on its own pickled copy of the objects it is handed, so any changes the children make to object state aren't automatically visible to the parent.
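To see why the side effect gets lost, it may help to imitate what happens to an argument on its way to a worker. The sketch below is only an illustration, not the actual Pool internals: it assumes the relevant step is a pickle round trip, which it performs by hand in a single process, and the name child_copy is made up for this example.

```python
import pickle


class subcl:
    def __init__(self):
        self.counter = 1

    def increment(self):
        self.counter += 1


original = subcl()

# Arguments are pickled before being sent to a worker process;
# the round trip is simulated here without starting any process.
child_copy = pickle.loads(pickle.dumps(original))

# The "worker" mutates its own copy, not the parent's object.
child_copy.increment()

print(original.counter)    # 1
print(child_copy.counter)  # 2
```

The parent's object keeps its old counter, which matches the behaviour described in the question.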

The easiest way to handle this in your example is to have wrapper return the updated object, and then use the return value of Pool.map:

from multiprocessing import Pool

class subcl:
    def __init__(self):
        self.counter=1
        return
    def increment(self):
        self.counter+=1
        return

def wrapper(targ):
    targ.increment()
    return targ                                        # <<<<< change #1

class sim:
    def __init__(self):
        self.world=[subcl(),subcl(),subcl(),subcl()]
    def run(self):
        if __name__=='__main__':
            p=Pool()
            self.world = p.map(wrapper,self.world)     # <<<<< change #2

a=sim()
a.run()
print(a.world[1].counter)  # now prints 2
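Since the question asks for a thread pool specifically, here is a hedged alternative sketch using multiprocessing.pool.ThreadPool, which offers the same map interface but runs the workers as threads inside the parent process. Because threads share memory, the in-place side effect survives. The function name run_threaded and the pool size of 4 are assumptions made for this example. Note that in CPython, threads generally won't speed up CPU-bound work, so this mainly suits methods that spend their time waiting on I/O.

```python
from multiprocessing.pool import ThreadPool


class subcl:
    def __init__(self):
        self.counter = 1

    def increment(self):
        self.counter += 1


def wrapper(targ):
    # Runs in a worker thread of the same process, so targ is the
    # caller's object and not a copy.
    targ.increment()


def run_threaded(world):
    # Hypothetical helper: map blocks until every call has finished.
    with ThreadPool(4) as pool:
        pool.map(wrapper, world)


world = [subcl() for _ in range(4)]
run_threaded(world)
print(world[1].counter)  # 2
```

Each object is handled by exactly one task here, so no locking is needed. If several tasks could touch the same object, some synchronization would be required.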

Source: https://stackoverflow.com/questions/6455465/how-can-i-parallelize-method-calls-on-an-array-of-objects

Tags

python

multiprocessing