问题
I have a simple code:
path = [filepath1, filepath2, filepath3]
def umap_embedding(filepath):
file = np.genfromtxt(filepath,delimiter=' ')
if len(file) > 20000:
file = file[np.random.choice(file.shape[0], 20000, replace=False), :]
neighbors = len(file)//200
if neighbors >= 2:
neighbors = neighbors
else:
neighbors = 2
embedder = umap.UMAP(n_neighbors=neighbors,
min_dist=0.1,
metric='correlation', n_components=2)
embedder.fit(file)
embedded = embedder.transform(file)
name = 'file'
np.savetxt(name,embedded,delimiter=",")
if __name__ == '__main__':
p = Pool(processes = 20)
start = time.time()
for filepath in path:
p.apply_async(umap_embedding, [filepath])
p.close()
p.join()
print("Complete")
end = time.time()
print('total time (s)= ' + str(end-start))
When I execute, the console return the error:
Traceback (most recent call last):
File "/home/cngc3/CBC/parallel.py", line 77, in <module>
p.apply_async(umap_embedding, [filepath])
File "/home/cngc3/anaconda3/envs/CBC/lib/python3.6/multiprocessing/pool.py", line 355, in apply_async
raise ValueError("Pool not running")
ValueError: Pool not running
I tried to find the solution for this problem on Stackoverflow and Google but there's no related problem. Thank you for your help.
回答1:
p.close()
and p.join()
must be placed after the for
-loop. Otherwise the pool is closed in the first iteration of the loop and doesn't accept new jobs in the second.
来源:https://stackoverflow.com/questions/52250054/python-valueerror-pool-not-running-in-async-multiprocessing