How to return a generator using joblib.Parallel()?

Submitted by 吃可爱长大的小学妹 on 2020-03-21 10:47:07

Question


In the code below, joblib.Parallel() returns a list.

import numpy as np
from joblib import Parallel, delayed

lst = [[0.0, 1, 2], [3, 4, 5], [6, 7, 8]]
arr = np.array(lst)
w, v = np.linalg.eigh(arr)

def proj_func(i):
    return np.dot(v[:,i].reshape(-1, 1), v[:,i].reshape(1, -1))

proj = Parallel(n_jobs=-1)(delayed(proj_func)(i) for i in range(len(w)))

Instead of a list, how do I return a generator using joblib.Parallel()?

Edit:

I have updated the code as suggested by @user3666197 in comments below.

import numpy as np
from joblib import Parallel, delayed

lst = [[0.0, 1, 2], [3, 4, 5], [6, 7, 8]]
arr = np.array(lst)
w, v = np.linalg.eigh(arr)

def proj_func(i):
    yield np.dot(v[:,i].reshape(-1, 1), v[:,i].reshape(1, -1))

proj = Parallel(n_jobs=-1)(delayed(proj_func)(i) for i in range(len(w)))

But I am getting this error:

TypeError: can't pickle generator objects

Am I missing something? How do I fix this? My main goal here is to reduce memory use, since proj can get very large, so I would like to consume the results one at a time instead of holding them all in a list.


Answer 1:


Q : "how do I return a generator using joblib.Parallel?"

joblib is built to distribute units of code execution across a set of spawned, independent processes ( yes, motivated by the performance gained from escaping the central GIL-lock, which otherwise re-[SERIAL]-ises execution one GIL-step after another ). The syntactic constructor that does this is joblib.Parallel(...)( delayed( fun )(...) ). My obviously limited imagination tells me that the most it can do is have each "remotely" executed process send its result back to main, where joblib assembles the results ( out of one's control ) into a list.

So the achievable maximum is a list of per-call results, not any form of deferred execution wrapped up and returned as a single generator. If the function fun(), injected via delayed( fun )(...) into the joblib.Parallel( n_jobs = ... )-many "remote" processes, is written as a generator function, each of those results would itself be a generator object, and, as the TypeError in the question shows, generator objects cannot be pickled for the trip back from a worker process.
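The TypeError from the question can be reproduced without joblib at all. The sketch below uses only the standard-library pickle module, on the assumption that results coming back from joblib's process-based workers go through pickle-style serialization; make_gen is a hypothetical stand-in for the edited proj_func.

```python
import pickle


def make_gen():
    # Hypothetical stand-in for the edited proj_func: because the body
    # uses `yield`, calling make_gen() returns a generator object.
    yield 1


try:
    # A generator holds a live, suspended execution frame, which pickle
    # has no way to serialize.
    pickle.dumps(make_gen())
    picklable = True
except TypeError as exc:
    picklable = False
    message = str(exc)

print(picklable)
```

The exact wording of the message varies between Python versions, but the cause is the same: the value that needs to travel between processes is the computed array, not a generator wrapped around it, so proj_func should use return rather than yield.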


A Bonus Part :

If we were indeed pedantic purists, the only way to receive exactly "a ( one ) generator using joblib.Parallel()" would be to set n_jobs == 1. That meets the stated goal lexically and logically ( to return but a, i.e. one, generator ), yet it would be less efficient and less meaningful than throwing money into the river Nile...



Source: https://stackoverflow.com/questions/60584543/how-to-return-a-generator-using-joblib-parallel
