Question
I want to fit many distributions with scipy and would like to use some sort of multiprocessing for this. Something like this:
import scipy.stats as ss
from pathos.multiprocessing import ProcessingPool
from multiprocessing import Pool
mp = Pool()
pp = ProcessingPool()
l = [0,1,2,3,4,6,7,8,9]
print map(ss.lognorm.fit,l) #method 0
print mp.map(ss.lognorm.fit,l) #method 1
print pp.map(ss.lognorm.fit,l) #method 2
Method 0 is of course not multiprocessing, but it works. Methods 1 and 2 both fail with long tracebacks. Does anybody have a workaround for this?
Method 1 error:
Process PoolWorker-1:
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python2.7/multiprocessing/pool.py", line 102, in worker
Process PoolWorker-2:
task = get()
File "/usr/lib/python2.7/multiprocessing/queues.py", line 376, in get
Process PoolWorker-4:
return recv()
Traceback (most recent call last):
AttributeError: ("'lognorm_gen' object has no attribute '_parse_args'", <built-in function getattr>, (<scipy.stats._continuous_distns.lognorm_gen object at 0x7fb15349ddd0>, '_parse_args'))
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python2.7/multiprocessing/pool.py", line 102, in worker
task = get()
File "/usr/lib/python2.7/multiprocessing/queues.py", line 376, in get
return recv()
AttributeError: ("'lognorm_gen' object has no attribute '_parse_args'", <built-in function getattr>, (<scipy.stats._continuous_distns.lognorm_gen object at 0x7fb15349ddd0>, '_parse_args'))
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python2.7/multiprocessing/pool.py", line 102, in worker
task = get()
File "/usr/lib/python2.7/multiprocessing/queues.py", line 376, in get
return recv()
AttributeError: ("'lognorm_gen' object has no attribute '_parse_args'", <built-in function getattr>, (<scipy.stats._continuous_distns.lognorm_gen object at 0x7fb15349ddd0>, '_parse_args'))
Process PoolWorker-3:
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python2.7/multiprocessing/pool.py", line 102, in worker
task = get()
File "/usr/lib/python2.7/multiprocessing/queues.py", line 376, in get
return recv()
AttributeError: ("'lognorm_gen' object has no attribute '_parse_args'", <built-in function getattr>, (<scipy.stats._continuous_distns.lognorm_gen object at 0x7fb15349ddd0>, '_parse_args'))
Process PoolWorker-5:
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python2.7/multiprocessing/pool.py", line 102, in worker
task = get()
File "/usr/lib/python2.7/multiprocessing/queues.py", line 376, in get
return recv()
AttributeError: ("'lognorm_gen' object has no attribute '_parse_args'", <built-in function getattr>, (<scipy.stats._continuous_distns.lognorm_gen object at 0x7fb15349ddd0>, '_parse_args'))
Process PoolWorker-6:
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python2.7/multiprocessing/pool.py", line 102, in worker
task = get()
File "/usr/lib/python2.7/multiprocessing/queues.py", line 376, in get
return recv()
AttributeError: ("'lognorm_gen' object has no attribute '_parse_args'", <built-in function getattr>, (<scipy.stats._continuous_distns.lognorm_gen object at 0x7fb15349ddd0>, '_parse_args'))
Process PoolWorker-7:
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python2.7/multiprocessing/pool.py", line 102, in worker
task = get()
File "/usr/lib/python2.7/multiprocessing/queues.py", line 376, in get
return recv()
AttributeError: ("'lognorm_gen' object has no attribute '_parse_args'", <built-in function getattr>, (<scipy.stats._continuous_distns.lognorm_gen object at 0x7fb15349ddd0>, '_parse_args'))
Process PoolWorker-8:
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python2.7/multiprocessing/pool.py", line 102, in worker
task = get()
File "/usr/lib/python2.7/multiprocessing/queues.py", line 376, in get
return recv()
AttributeError: ("'lognorm_gen' object has no attribute '_parse_args'", <built-in function getattr>, (<scipy.stats._continuous_distns.lognorm_gen object at 0x7fb15349ddd0>, '_parse_args'))
Process PoolWorker-9:
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python2.7/multiprocessing/pool.py", line 102, in worker
task = get()
File "/usr/lib/python2.7/multiprocessing/queues.py", line 376, in get
return recv()
AttributeError: ("'lognorm_gen' object has no attribute '_parse_args'", <built-in function getattr>, (<scipy.stats._continuous_distns.lognorm_gen object at 0x7fb15349ddd0>, '_parse_args'))
Method 2 error:
Exception in thread Thread-4:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
self.run()
File "/usr/lib/python2.7/threading.py", line 763, in run
self.__target(*self.__args, **self.__kwargs)
File "/usr/local/lib/python2.7/dist-packages/processing/pool.py", line 207, in _handleTasks
put(task)
File "/usr/local/lib/python2.7/dist-packages/dill-0.2.2-py2.7.egg/dill/dill.py", line 192, in dumps
dump(obj, file, protocol, byref, fmode)#, strictio)
File "/usr/local/lib/python2.7/dist-packages/dill-0.2.2-py2.7.egg/dill/dill.py", line 182, in dump
pik.dump(obj)
File "/usr/lib/python2.7/pickle.py", line 224, in dump
self.save(obj)
File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/usr/lib/python2.7/pickle.py", line 562, in save_tuple
save(element)
File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/usr/lib/python2.7/pickle.py", line 548, in save_tuple
save(element)
File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/usr/lib/python2.7/pickle.py", line 548, in save_tuple
save(element)
File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/usr/local/lib/python2.7/dist-packages/dill-0.2.2-py2.7.egg/dill/dill.py", line 626, in save_function
obj.__dict__), obj=obj)
File "/usr/lib/python2.7/pickle.py", line 401, in save_reduce
save(args)
File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/usr/lib/python2.7/pickle.py", line 562, in save_tuple
save(element)
File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/usr/lib/python2.7/pickle.py", line 548, in save_tuple
save(element)
File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/usr/local/lib/python2.7/dist-packages/dill-0.2.2-py2.7.egg/dill/dill.py", line 826, in save_cell
pickler.save_reduce(_create_cell, (obj.cell_contents,), obj=obj)
File "/usr/lib/python2.7/pickle.py", line 401, in save_reduce
save(args)
File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/usr/lib/python2.7/pickle.py", line 548, in save_tuple
save(element)
File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/usr/local/lib/python2.7/dist-packages/dill-0.2.2-py2.7.egg/dill/dill.py", line 794, in save_instancemethod0
obj.im_class), obj=obj)
File "/usr/lib/python2.7/pickle.py", line 401, in save_reduce
save(args)
File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/usr/lib/python2.7/pickle.py", line 548, in save_tuple
save(element)
File "/usr/lib/python2.7/pickle.py", line 331, in save
self.save_reduce(obj=obj, *rv)
File "/usr/lib/python2.7/pickle.py", line 419, in save_reduce
save(state)
File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/usr/local/lib/python2.7/dist-packages/dill-0.2.2-py2.7.egg/dill/dill.py", line 658, in save_module_dict
StockPickler.save_dict(pickler, obj)
File "/usr/lib/python2.7/pickle.py", line 649, in save_dict
self._batch_setitems(obj.iteritems())
File "/usr/lib/python2.7/pickle.py", line 681, in _batch_setitems
save(v)
File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/usr/local/lib/python2.7/dist-packages/dill-0.2.2-py2.7.egg/dill/dill.py", line 794, in save_instancemethod0
obj.im_class), obj=obj)
File "/usr/lib/python2.7/pickle.py", line 401, in save_reduce
save(args)
File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/usr/lib/python2.7/pickle.py", line 548, in save_tuple
save(element)
File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/usr/local/lib/python2.7/dist-packages/dill-0.2.2-py2.7.egg/dill/dill.py", line 615, in save_function
if not _locate_function(obj): #, pickler._session):
File "/usr/local/lib/python2.7/dist-packages/dill-0.2.2-py2.7.egg/dill/dill.py", line 604, in _locate_function
found = _import_module(obj.__module__ + '.' + obj.__name__, safe=True)
TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'
Answer 1:
Method 1 doesn't work because you can't pickle bound instance methods with pickle. Method 2 doesn't work because scipy.stats is doing something "tricky"… something that the dill and pathos author (me) can't identify without first investigating.
You can see that the issue is not that scipy.stats is using a bound method (that is not a problem for dill or pathos), but that it's doing some renaming magic… which is why, when you look at the traceback from your pathos call, you see _locate_function failing (the function's __module__ comes back as None, hence the TypeError on the last line)… and this is actually why Method 2 doesn't work.
>>> import scipy.stats as ss
>>>
>>> ss.lognorm
<scipy.stats._continuous_distns.lognorm_gen object at 0x10932d6d0>
The workaround is simple: make the method easier to find by wrapping it in a function that knows where it is.
>>> import pathos.multiprocessing as mp
>>> p = mp.ProcessingPool()
>>>
>>> def doit(x):
... return ss.lognorm.fit(x)
...
>>> p.map(doit, range(5))
[(1.0, 0.0, 1.0), (1.0, 1.0, 1.0), (1.0, 2.0, 1.0), (1.0, 3.0, 1.0), (1.0, 4.0, 1.0)]
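The same wrapper idea can be sketched with the standard library's multiprocessing.Pool. This is a minimal Python 3 sketch, not the code from the thread (which ran on Python 2.7): the names fit_lognorm and make_samples and the synthetic data are assumptions made for illustration. A module-level function is pickled by its module and name, so each worker imports it and looks up ss.lognorm itself, and the distribution object never has to be serialized.

```python
# Python 3 sketch; the original question used Python 2.7.
from multiprocessing import Pool

import numpy as np
import scipy.stats as ss


def fit_lognorm(sample):
    """Module-level wrapper: pickled by name, so workers resolve ss.lognorm locally."""
    return ss.lognorm.fit(sample)


def make_samples(n_samples=4, size=200, seed=0):
    """Hypothetical test data: arrays of positive, lognormally distributed values."""
    rng = np.random.default_rng(seed)
    return [rng.lognormal(mean=0.0, sigma=0.5, size=size) for _ in range(n_samples)]


if __name__ == "__main__":
    samples = make_samples()
    with Pool(processes=2) as pool:
        # Each result is a (shape, loc, scale) tuple from lognorm.fit.
        results = pool.map(fit_lognorm, samples)
    for shape, loc, scale in results:
        print(shape, loc, scale)
```

The `if __name__ == "__main__":` guard keeps worker processes from re-creating the pool on platforms that start workers by re-importing the main module.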
Source: https://stackoverflow.com/questions/27994321/python-multiprocessing-scipy-stats-lognorm-fit