I have a long script which, at the end, needs to run a function over every item of a huge list, which takes a long time. Consider for example:
input_a = [1, 2, 3, 4]  # a

from multiprocessing import Pool as ThreadPool
import requests

API_URL = 'http://example.com/api'

pool = ThreadPool(4)  # Hint...

def foo(x):
    params = {'x': x}
    r = requests.get(API_URL, params=params)
    return r.json()

if __name__ == '__main__':
    num_iter = [1, 2, 3, 4, 5]
    out = pool.map(foo, num_iter)
    print(out)
Hint's answer: this is why the exception is raised: the Pool is created outside the if __name__ == '__main__' guard, so it is re-created every time a child process imports the module. Fixed:
from multiprocessing import Pool as ThreadPool
import requests

API_URL = 'http://example.com/api'

def foo(x):
    params = {'x': x}
    r = requests.get(API_URL, params=params)
    return r.json()

if __name__ == '__main__':
    pool = ThreadPool(4)  # Hint...
    num_iter = [1, 2, 3, 4, 5]
    out = pool.map(foo, num_iter)
    print(out)
The Python docs touch on this scenario as well: https://docs.python.org/2/library/multiprocessing.html#using-a-pool-of-workers
I did not find this to be an issue at all when using multiprocessing.dummy.
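That matches how multiprocessing.dummy works: it wraps the threading module, so no child interpreter is started and the module is never re-imported, which is why a pool at module scope is fine there. A minimal sketch (using a local function in place of the HTTP call from the question):

```python
from multiprocessing.dummy import Pool as ThreadPool  # thread-based, no re-import

# Creating the pool at module scope is safe here, because dummy
# spawns threads rather than child processes.
pool = ThreadPool(4)

def square(x):
    # stand-in for the per-item work (an HTTP request in the question)
    return x * x

out = pool.map(square, [1, 2, 3, 4, 5])
pool.close()
pool.join()
print(out)  # [1, 4, 9, 16, 25]
```

Note that threads only help when the work is I/O-bound, as a web request is; CPU-bound work still needs real processes.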
Multiprocessing needs to be able to import your module, as stated at the top of the documentation.
You have a bunch of code sitting at module (global) scope, so this will be run every time the module is imported.
Put it within your if __name__ == '__main__' block, or better yet, in a function.
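A minimal restructuring along those lines might look like this (the main() name is just a convention, and foo here is a stand-in for the real per-item work):

```python
from multiprocessing import Pool

def foo(x):
    # stand-in for the HTTP request in the question
    return x + 1

def main():
    # All side-effecting code lives here, so a child process importing
    # this module only runs the definitions above, nothing else.
    pool = Pool(4)
    try:
        return pool.map(foo, [1, 2, 3, 4, 5])
    finally:
        pool.close()
        pool.join()

if __name__ == '__main__':
    print(main())
```

With this layout, importing the module never creates a pool, so the guard requirement is satisfied no matter how the workers are started.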