In the example code below, I\'d like to recover the return value of the function worker
. How can I go about doing this? Where is this value stored?
Thought I'd simplify the simplest examples copied from above, working for me on Py3.6. Simplest is multiprocessing.Pool:
import multiprocessing
import time
def worker(x):
time.sleep(1)
return x
pool = multiprocessing.Pool()
print(pool.map(worker, range(10)))
You can set the number of processes in the pool with, e.g., Pool(processes=5)
. However it defaults to CPU count, so leave it blank for CPU-bound tasks. (I/O-bound tasks often suit threads anyway, as the threads are mostly waiting so can share a CPU core.) Pool
also applies chunking optimization.
(Note that the worker method cannot be nested within a method. I initially defined my worker method inside the method that makes the call to pool.map
, to keep it all self-contained, but then the processes couldn't import it, and threw "AttributeError: Can't pickle local object outer_method..inner_method". More here. It can be inside a class.)
(Appreciate the original question specified printing 'represent!'
rather than time.sleep()
, but without it I thought some code was running concurrently when it wasn't.)
Py3's ProcessPoolExecutor is also two lines (.map
returns a generator so you need the list()
):
from concurrent.futures import ProcessPoolExecutor
with ProcessPoolExecutor() as executor:
print(list(executor.map(worker, range(10))))
With plain Processes:
import multiprocessing
import time
def worker(x, queue):
time.sleep(1)
queue.put(x)
queue = multiprocessing.SimpleQueue()
tasks = range(10)
for task in tasks:
multiprocessing.Process(target=worker, args=(task, queue,)).start()
for _ in tasks:
print(queue.get())
Use SimpleQueue if all you need is put
and get
. The first loop starts all the processes, before the second makes the blocking queue.get
calls. I don't think there's any reason to call p.join()
too.