Restrictions on dynamically created functions with concurrent.futures.ProcessPoolExecutor

Submitted by 早过忘川 on 2021-02-09 07:12:25

Question


I am trying to do some multiprocessing with functions that I dynamically create within other functions. It seems I can run these if the function fed to ProcessPoolExecutor is module-level:

import concurrent.futures

def make_func(a):
    def dynamic_func(i):
        return i, i**2 + a
    return dynamic_func

f_dyns = [make_func(a) for a in range(10)]
def loopfunc(i):
    return f_dyns[i](i)

with concurrent.futures.ProcessPoolExecutor(3) as executor:
    for i,r in executor.map(loopfunc, range(10)):
        print(i,":",r)

output:

0 : 0
1 : 2
2 : 6
3 : 12
4 : 20
5 : 30
6 : 42
7 : 56
8 : 72
9 : 90

However, it fails if the pool is launched from a method of a class:

class Test:
    def __init__(self,myfunc):
        self.f = myfunc

    def loopfunc(self,i):
        return self.f(i)

    def run(self):
        with concurrent.futures.ProcessPoolExecutor(3) as executor:
            for i,r in executor.map(self.loopfunc, range(10)):
                print(i,":",r)

o2 = Test(make_func(1))
o2.run()

output:

Traceback (most recent call last):
  File "/home/farmer/anaconda3/envs/general/lib/python3.6/multiprocessing/queues.py", line 234, in _feed
    obj = _ForkingPickler.dumps(obj)
  File "/home/farmer/anaconda3/envs/general/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
AttributeError: Can't pickle local object 'make_func.<locals>.dynamic_func'

On the other hand, I can run the multiprocessing from a method just fine as long as no dynamically generated function is involved. Is there some way around this? I tried adding the dynamically generated function to the 'globals' dictionary, but that didn't help:

def make_func_glob(a):
    def dynamic_func(i):
        return i, i**2 + a
    globals()['my_func_{0}'.format(a)] = dynamic_func

make_func_glob(1)
print("test:", my_func_1(3))
o3 = Test(my_func_1)
o3.run()

output:

test: (3, 10)
Traceback (most recent call last):
  File "/home/farmer/anaconda3/envs/general/lib/python3.6/multiprocessing/queues.py", line 234, in _feed
    obj = _ForkingPickler.dumps(obj)
  File "/home/farmer/anaconda3/envs/general/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
AttributeError: Can't pickle local object 'make_func_glob.<locals>.dynamic_func'

So Python still treats it as a local object even though I added it to the 'globals' dict. Something like this 'globals' idea would be fine; I don't need anything fancy. I am only creating these functions dynamically for convenience, and I would be perfectly happy for them to be global objects. They will always be defined by the module. There are just a lot of them with almost the same definition, so it is more convenient to define them programmatically than to write them all out by hand. I would have thought it was possible to get Python to recognise them as "true" functions, as if I had defined them via 'exec', or at least close enough that I could use them in my parallelised code.
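The "local object" wording in the error appears to come from the function's __qualname__, which is fixed when the def statement runs. Storing the function in globals() adds a second name for it but does not change that attribute. The sketch below checks this and then tries one possible workaround: overwriting __name__ and __qualname__ so that pickle can find the function under its global name. The helper register_global is a hypothetical name introduced here, and the workaround assumes the same registration code also runs at import time in every worker process.

```python
import pickle


def register_global(func, name):
    # Hypothetical helper: expose func as a module-level name and rename it
    # so that pickle's lookup by qualified name finds the same object.
    func.__name__ = name
    func.__qualname__ = name
    globals()[name] = func
    return func


def make_func_glob(a):
    def dynamic_func(i):
        return i, i**2 + a
    return dynamic_func


f = make_func_glob(1)
# Still reported as nested, whatever globals() contains.
print(f.__qualname__)  # make_func_glob.<locals>.dynamic_func

g = register_global(make_func_glob(1), "my_func_1")
print(g.__qualname__)  # my_func_1

if __name__ == "__main__":
    # pickle now stores only a reference (module name plus "my_func_1").
    restored = pickle.loads(pickle.dumps(g))
    print(restored(3))  # (3, 10)
```

Because the pickle contains only a reference, the function body and the captured value of a are not transferred; the worker has to be able to rebuild the same global by importing the module.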


Answer 1:


As the error messages suggest, this has more to do with pickling than with dynamically generated functions as such. From https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ProcessPoolExecutor

only picklable objects can be executed and returned.

and from https://docs.python.org/3/library/pickle.html#what-can-be-pickled-and-unpickled, the sorts of functions which can be pickled:

functions defined at the top level of a module (using def, not lambda)

which suggests other sorts of functions can't be pickled. From the code in the question a function that doesn't adhere to this leaps out: dynamic_func from...

def make_func(a):
    def dynamic_func(i):
        return i, i**2 + a
    return dynamic_func

...and you hint that this is the issue....

So I would have thought it was possible to somehow get python to recognise them as "true" functions

You can! You can put dynamic_func on the top level, and use partial rather than a closure...

from functools import partial
def dynamic_func(a, i):
    return i, i**2 + a

def make_func(a):
    return partial(dynamic_func, a)
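To check the difference without starting a pool, both variants can be passed to pickle directly. This is a minimal sketch; make_partial, make_closure and can_pickle are hypothetical names introduced only for the comparison.

```python
import pickle
from functools import partial


def dynamic_func(a, i):
    # Module-level function: pickle stores it by reference (module + name).
    return i, i**2 + a


def make_partial(a):
    # A partial pickles as a reference to dynamic_func plus the bound argument.
    return partial(dynamic_func, a)


def make_closure(a):
    # Nested function: its qualified name contains "<locals>", so pickle
    # cannot look it up again by name.
    def inner(i):
        return i, i**2 + a
    return inner


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        return False


if __name__ == "__main__":
    print(can_pickle(make_partial(1)))   # True
    print(can_pickle(make_closure(1)))   # False
    restored = pickle.loads(pickle.dumps(make_partial(1)))
    print(restored(3))                   # (3, 10)
```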

So in full...

import concurrent.futures
from functools import partial

def dynamic_func(a, i):
    return i, i**2 + a

def make_func(a):
    return partial(dynamic_func, a)

f_dyns = [make_func(a) for a in range(10)]
def loopfunc(i):
    return f_dyns[i](i)


class Test:
    def __init__(self, myfunc):
        self.f = myfunc

    def loopfunc(self, i):
        return self.f(i)

    def run(self):
        with concurrent.futures.ProcessPoolExecutor(3) as executor:
            for i,r in executor.map(self.loopfunc, range(10)):
                print(i,":",r)

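# The guard keeps worker processes from re-running this block when the
# module is imported under the "spawn" start method (Windows, macOS).
if __name__ == "__main__":
    o2 = Test(make_func(1))
    o2.run()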

But why the original form without classes worked, I don't know. By my understanding, it would also be trying to pickle a non-top-level function, so I think my understanding is incomplete.
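A likely explanation, offered as a sketch rather than a definitive account: executor.map only pickles the callable it is given and the arguments. A module-level loopfunc is pickled by reference (module and name), so the closures stored in f_dyns never pass through pickle; each worker finds its own copy of f_dyns in the module, either inherited through fork or rebuilt on import. A bound method such as self.loopfunc is pickled together with its instance, so self.f, the closure, has to be pickled as well, and that is what fails. The helper pickle_error below is a hypothetical name used only to show the two cases.

```python
import pickle


def make_func(a):
    def dynamic_func(i):
        return i, i**2 + a
    return dynamic_func


f_dyns = [make_func(a) for a in range(10)]


def loopfunc(i):
    return f_dyns[i](i)


class Test:
    def __init__(self, myfunc):
        self.f = myfunc

    def loopfunc(self, i):
        return self.f(i)


def pickle_error(obj):
    """Return None if obj pickles, else the exception that was raised."""
    try:
        pickle.dumps(obj)
        return None
    except (AttributeError, pickle.PicklingError) as exc:
        return exc


if __name__ == "__main__":
    # Pickled by name only; f_dyns and its closures are not serialized.
    print(pickle_error(loopfunc))
    # Pickling the bound method pickles the instance, and with it self.f.
    print(pickle_error(Test(make_func(1)).loopfunc))
```

The first call should print None, and the second should print a "Can't pickle local object" error similar to the traceback in the question.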



Source: https://stackoverflow.com/questions/49899351/restrictions-on-dynamically-created-functions-with-concurrent-futures-processpoo
