Question
I have written a nice parallel job processor that accepts jobs (functions, their arguments, timeout information, etc.) and submits them to a Python multiprocessing pool. I can provide the full (long) code if requested, but the key step (as I see it) is the asynchronous application to the pool:
job.resultGetter = self.pool.apply_async(
    func = job.workFunction,
    kwds = job.workFunctionKeywordArguments
)
I am trying to use this parallel job processor with a large body of legacy code and, perhaps naturally, have run into pickling problems:
PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed
I see this error when I try to submit a problematic object as an argument to a work function. The real problem is that this is legacy code, and I am advised that I can make only very minor changes to it. So, is there some clever trick or simple modification I can make somewhere that would allow my parallel job processor code to cope with these traditionally unpicklable objects? I have total control over the parallel job processor code, so I am open to, say, wrapping every submitted function in another function (a sketch of that idea follows below). For the legacy code, I should be able to add the occasional small method to objects, but that's about it. Is there some clever approach to this type of problem?
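To make that wrapping idea concrete, here is a minimal sketch, assuming the third-party dill package is installed; the names run_dill_encoded and Legacy are hypothetical and used only for illustration. The job processor dill-serializes the work function and its keyword arguments itself, so the stock pickler only ever sees a bytes payload plus one module-level helper:

import multiprocessing

import dill  # third-party: pip install dill

def run_dill_encoded(payload):
    # Module-level, so the standard pickler can ship it to the worker;
    # it unpacks the dill payload and calls the real work function there.
    work_function, keyword_arguments = dill.loads(payload)
    return work_function(**keyword_arguments)

class Legacy(object):
    # Stand-in for a legacy class; under Python 2 its bound methods
    # are not picklable by the standard pickler.
    def work(self, x=0):
        return x * 2

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=2)
    work_function = Legacy().work  # a bound instance method
    payload = dill.dumps((work_function, {'x': 21}))
    result_getter = pool.apply_async(func=run_dill_encoded, args=(payload,))
    print(result_getter.get())  # prints 42
    pool.close()
    pool.join()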
Answer 1:
Use dill and pathos.multiprocessing instead of pickle and multiprocessing.
See here:
What can multiprocessing and dill do together?
http://matthewrocklin.com/blog/work/2013/12/05/Parallelism-and-Serialization/
How to pickle functions/classes defined in __main__ (python)
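A rough sketch of that substitution, assuming pathos is installed (it pulls in dill as a dependency); the Legacy class is a placeholder, and apipe is pathos' asynchronous-apply counterpart to apply_async:

from pathos.multiprocessing import ProcessingPool  # pip install pathos

class Legacy(object):
    # Placeholder for a legacy class whose methods we want to run in the pool.
    def work(self, x=0):
        return x * 2

if __name__ == '__main__':
    pool = ProcessingPool(nodes=2)
    # pathos serializes with dill rather than pickle, so a bound instance
    # method is an acceptable work function; apipe submits it asynchronously.
    result_getter = pool.apipe(Legacy().work, x=21)
    print(result_getter.get())  # prints 42

Because pathos serializes with dill rather than pickle, bound instance methods, closures, and lambdas can all be submitted as work functions without modifying the legacy classes.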
Source: https://stackoverflow.com/questions/21375509/in-a-pickle-how-to-serialise-legacy-objects-for-submission-to-a-python-multipro