Question
In the code below, joblib.Parallel() returns a list.
import numpy as np
from joblib import Parallel, delayed

lst = [[0.0, 1, 2], [3, 4, 5], [6, 7, 8]]
arr = np.array(lst)
w, v = np.linalg.eigh(arr)

def proj_func(i):
    return np.dot(v[:,i].reshape(-1, 1), v[:,i].reshape(1, -1))

proj = Parallel(n_jobs=-1)(delayed(proj_func)(i) for i in range(len(w)))
Instead of a list, how do I return a generator using joblib.Parallel()?
Edit:
I have updated the code as suggested by @user3666197 in comments below.
import numpy as np
from joblib import Parallel, delayed

lst = [[0.0, 1, 2], [3, 4, 5], [6, 7, 8]]
arr = np.array(lst)
w, v = np.linalg.eigh(arr)

def proj_func(i):
    yield np.dot(v[:,i].reshape(-1, 1), v[:,i].reshape(1, -1))

proj = Parallel(n_jobs=-1)(delayed(proj_func)(i) for i in range(len(w)))
But I am getting this error:
TypeError: can't pickle generator objects
Am I missing something? How do I fix this? My main goal is to reduce memory use, since proj can get very large, so I would like to consume the results one at a time instead of holding them all in a list.
Answer 1:
Q: "how do I return a generator using joblib.Parallel?"
joblib is designed to distribute units of code execution across a set of spawned, independent processes (the motivation being the performance gained by escaping the central GIL lock, which otherwise re-serialises execution into one GIL step after another). Given that purpose and implementation, and given the syntactic constructor joblib.Parallel(...)( delayed()(...) ), my admittedly limited imagination tells me the most one can achieve is to have the "remotely" executed processes return the requested generator(s) back to main, where joblib assembles them (outside of one's control) into a list.
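So the achievable maximum is to receive a list of generators, not any form of deferred execution wrapped on return as a single generator, given the initial conditions above and provided that the function fun(), injected via delayed( fun )(...) into the joblib.Parallel( n_jobs = ... )-many "remote" processes, indeed returns them. Note that with process-based workers every return value must also survive pickling on its way back to main, which a generator object does not, hence the TypeError in the question.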
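The TypeError from the question can be reproduced with the standard library alone, which suggests it comes from pickling the return value rather than from joblib itself. A minimal sketch, where proj_gen is a hypothetical stand-in for the yield-based proj_func:

```python
import pickle


def proj_gen(i):
    # Hypothetical stand-in for the question's yield-based proj_func.
    yield i * i


# Calling a generator function only builds a generator object; nothing runs yet.
gen = proj_gen(3)

try:
    pickle.dumps(gen)
    pickled_ok = True
except TypeError as exc:
    # A generator object wraps a live execution frame, which pickle refuses.
    pickled_ok = False
    error_message = str(exc)

# The computed values pickle fine once the generator has been consumed.
payload = pickle.dumps(list(proj_gen(3)))
restored = pickle.loads(payload)
```

If that reading is right, the workers should return plain values (as the first version of proj_func does), and any laziness has to live on the main-process side.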
A Bonus Part:
If we were indeed pedantic purists, the only chance to receive just "a (one) generator using joblib.Parallel()" would be for n_jobs to be == 1. That would lexically and logically meet the defined goal, to return (but) a (one) generator, yet it would be less efficient and less meaningful than throwing money into the river Nile...
Source: https://stackoverflow.com/questions/60584543/how-to-return-a-generator-using-joblib-parallel