In the Python multiprocessing library, is there a variant of pool.map which supports multiple arguments?
text = "test"
def
Having learnt about itertools in J.F. Sebastian's answer, I decided to take it a step further and write a parmap package that takes care of parallelization, offering map and starmap functions on Python 2.7 and Python 3.2 (and later) that can take any number of positional arguments.
Installation
pip install parmap
How to parallelize:
import parmap
# If you want to do:
y = [myfunction(x, argument1, argument2) for x in mylist]
# In parallel:
y = parmap.map(myfunction, mylist, argument1, argument2)
# If you want to do:
z = [myfunction(x, y, argument1, argument2) for (x,y) in mylist]
# In parallel:
z = parmap.starmap(myfunction, mylist, argument1, argument2)
# If you want to do:
listx = [1, 2, 3, 4, 5, 6]
listy = [2, 3, 4, 5, 6, 7]
param1 = 3.14
param2 = 42
listz = []
for (x, y) in zip(listx, listy):
    listz.append(myfunction(x, y, param1, param2))
# In parallel:
listz = parmap.starmap(myfunction, zip(listx, listy), param1, param2)
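For comparison, the argument packing that the last example relies on can be approximated with the standard library alone. This is a minimal sketch, not parmap's actual implementation: it assumes Python 3.3 or later (where multiprocessing.Pool.starmap is available), and myfunction and pack_args are hypothetical names introduced only for illustration.

```python
from multiprocessing import Pool


def myfunction(x, y, param1, param2):
    # Hypothetical worker function, standing in for the one used above.
    return x * y + param1 + param2


def pack_args(pairs, *extra):
    # Append the same extra positional arguments to every (x, y) tuple.
    return [tuple(pair) + extra for pair in pairs]


if __name__ == "__main__":
    listx = [1, 2, 3]
    listy = [2, 3, 4]
    tasks = pack_args(zip(listx, listy), 10, 100)
    # Each tuple in tasks is unpacked into myfunction's positional arguments.
    with Pool(processes=2) as pool:
        listz = pool.starmap(myfunction, tasks)
    print(listz)
```

The helper only builds the argument tuples. The pool is created under the `__main__` guard so the script also behaves on platforms that start worker processes by re-importing the main module.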
I have uploaded parmap to PyPI and to a GitHub repository.
As an example, the question can be answered as follows:
import parmap
def harvester(case, text):
    X = case[0]
    return text + str(X)

if __name__ == "__main__":
    case = RAW_DATASET  # assuming this is an iterable
    parmap.map(harvester, case, "test", chunksize=1)
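If installing an extra package is not an option, the single-constant-argument case can also be sketched with functools.partial and the standard Pool.map. This is only an illustration under assumptions: harvester is written here to return the concatenated string, and cases is a made-up dataset standing in for RAW_DATASET.

```python
from functools import partial
from multiprocessing import Pool


def harvester(case, text):
    # Mirrors the harvester in the question; returns the concatenated string.
    return text + str(case[0])


# Bind the constant argument by keyword so Pool.map only has to supply `case`.
harvest_test = partial(harvester, text="test")

if __name__ == "__main__":
    cases = [("a", 1), ("b", 2)]  # hypothetical stand-in for RAW_DATASET
    with Pool(processes=2) as pool:
        results = pool.map(harvest_test, cases, chunksize=1)
    print(results)
```

The partial object can be pickled as long as harvester is defined at module level, which is what lets Pool.map send it to the worker processes.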