Getting a function pickleable for use in differential evolution with workers=-1

旧巷少年郎 2021-01-16 13:59

I edited my original post in order to give a simpler example. I use differential evolution (DE) from SciPy to optimize certain parameters, and I would like to use all of the PC's processors.

1 Answer
攒了一身酷 2021-01-16 14:29

    Two things:

    1. I'm not able to reproduce the error you are seeing unless I first pickle/unpickle the custom function.
    2. There's no need to pickle/unpickle the custom function before passing it to the solver.

    This seems to work for me, with Python 3.6.12 and scipy 1.5.2:

    >>> from scipy.optimize import rosen, differential_evolution
    >>> bounds = [(0,2), (0, 2)]
    >>> 
    >>> def Ros_custom(X):
    ...     x = X[0]
    ...     y = X[1]
    ...     a = 1. - x
    ...     b = y - x*x
    ...     return a*a + b*b*100
    ... 
    >>> result = differential_evolution(Ros_custom, bounds, updating='deferred', workers=-1)
    >>> result.x, result.fun
    (array([1., 1.]), 0.0)
    >>> 
    >>> result
         fun: 0.0
     message: 'Optimization terminated successfully.'
        nfev: 4953
         nit: 164
     success: True
           x: array([1., 1.])
    >>> 
    

    I can even nest a function inside of the custom objective:

    >>> def foo(a,b):
    ...   return a*a + b*b*100
    ... 
    >>> def custom(X):
    ...   x,y = X[0],X[1]
    ...   return foo(1.-x, y-x*x)
    ... 
    >>> result = differential_evolution(custom, bounds, updating='deferred', workers=-1)
    >>> result
         fun: 0.0
     message: 'Optimization terminated successfully.'
        nfev: 4593
         nit: 152
     success: True
           x: array([1., 1.])
    

    So, for me at least, the code works as expected.

    You should have no need to serialize/deserialize the function ahead of its use in scipy. Yes, the function needs to be picklable, but scipy will do that for you. Basically, what's happening under the covers is that your function gets serialized, passed to multiprocessing as a string, distributed to the processors, and then unpickled and used on the target processors.

    Like this, for 4 sets of inputs, run one per processor:

    >>> import multiprocessing as mp
    >>> res = mp.Pool().map(custom, [(0,1), (1,2), (4,9), (3,4)])
    >>> list(res)
    [101.0, 100.0, 4909.0, 2504.0]
    >>> 
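
    A quick sanity check of my own (not from the original post): a module-level function should survive a pickle round trip, which is essentially what scipy's worker machinery relies on. A minimal sketch, reusing the custom objective from above:

    >>> import pickle
    >>> restored = pickle.loads(pickle.dumps(custom))  # round-trip, as the workers would
    >>> restored([1., 1.]) == custom([1., 1.])         # the restored copy behaves identically
    True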
    

    Older versions of multiprocessing had difficulty serializing functions defined in the interpreter, and often needed the code to be executed in a __main__ block. If you are on Windows, this is still often the case... and you might also need to call mp.freeze_support(), depending on how the code in scipy is implemented.
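
    For reference, here's a minimal sketch of that Windows-safe layout (my addition; it reuses the Ros_custom example from above):

    import multiprocessing as mp
    from scipy.optimize import differential_evolution

    bounds = [(0, 2), (0, 2)]

    def Ros_custom(X):
        # same Rosenbrock-style objective as above, defined at module level
        x, y = X[0], X[1]
        a = 1. - x
        b = y - x*x
        return a*a + b*b*100

    if __name__ == '__main__':
        mp.freeze_support()  # a no-op unless the script is frozen into an executable
        result = differential_evolution(Ros_custom, bounds,
                                        updating='deferred', workers=-1)
        print(result.x, result.fun)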

    I tend to like dill (I'm the author) because it can serialize a broader range of objects than pickle. However, as scipy uses multiprocessing, which uses pickle... I often choose to use mystic (I'm the author), which uses multiprocess (I'm the author), which uses dill. The codes below are, very roughly, equivalent, but they all work with dill instead of pickle.

    >>> from mystic.solvers import diffev2
    >>> from pathos.pools import ProcessPool
    >>> diffev2(custom, bounds, npop=40, ftol=1e-10, map=ProcessPool().map)
    Optimization terminated successfully.
             Current function value: 0.000000
             Iterations: 42
             Function evaluations: 1720
    array([1.00000394, 1.00000836])
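
    An aside of my own: since multiprocess is a drop-in replacement for multiprocessing, the earlier Pool example also runs unchanged, with dill doing the serialization:

    >>> import multiprocess
    >>> multiprocess.Pool().map(custom, [(0,1), (1,2), (4,9), (3,4)])
    [101.0, 100.0, 4909.0, 2504.0]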
    

    With mystic, you get some additional nice features, like a monitor:

    >>> from mystic.monitors import VerboseMonitor
    >>> mon = VerboseMonitor(5,5)
    >>> diffev2(custom, bounds, npop=40, ftol=1e-10, itermon=mon, map=ProcessPool().map)
    Generation 0 has ChiSquare: 0.065448
    Generation 0 has fit parameters:
     [0.769543181527466, 0.5810893880113548]
    Generation 5 has ChiSquare: 0.065448
    Generation 5 has fit parameters:
     [0.588156685059123, -0.08325052939774935]
    Generation 10 has ChiSquare: 0.060129
    Generation 10 has fit parameters:
     [0.8387858177101133, 0.6850849855634057]
    Generation 15 has ChiSquare: 0.001492
    Generation 15 has fit parameters:
     [1.0904350077743412, 1.2027007403275813]
    Generation 20 has ChiSquare: 0.001469
    Generation 20 has fit parameters:
     [0.9716429877952866, 0.9466681129902448]
    Generation 25 has ChiSquare: 0.000114
    Generation 25 has fit parameters:
     [0.9784047411865372, 0.9554056558210251]
    Generation 30 has ChiSquare: 0.000000
    Generation 30 has fit parameters:
     [0.996105436348129, 0.9934091068974504]
    Generation 35 has ChiSquare: 0.000000
    Generation 35 has fit parameters:
     [0.996589586891175, 0.9938925277204567]
    Generation 40 has ChiSquare: 0.000000
    Generation 40 has fit parameters:
     [1.0003791956048833, 1.0007133195321427]
    Generation 45 has ChiSquare: 0.000000
    Generation 45 has fit parameters:
     [1.0000170425596364, 1.0000396089375592]
    Generation 50 has ChiSquare: 0.000000
    Generation 50 has fit parameters:
     [0.9999013984263114, 0.9998041148375927]
    STOP("VTRChangeOverGeneration with {'ftol': 1e-10, 'gtol': 1e-06, 'generations': 30, 'target': 0.0}")
    Optimization terminated successfully.
             Current function value: 0.000000
             Iterations: 54
             Function evaluations: 2200
    array([0.99999186, 0.99998338])
    >>> 
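
    The monitor also keeps the logged trajectory around, so you can inspect it after the solve. The attribute names below are from memory of the mystic Monitor interface, so treat them as an assumption to verify against your version:

    >>> mon.x[-1]   # parameters at the last logged generation (assumed attribute)
    >>> mon.y[-1]   # corresponding cost (assumed attribute)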
    

    All of the above are running in parallel.

    So, in summary, the code should work as-is (and without pre-pickling), except perhaps on Windows, where you might need to use freeze_support and run the code in a __main__ block, as in the sketch above.
