Question
I have a function that I want to map over a list in parallel. I achieved this using the multiprocess module. However, when I run the same task with the multiprocessing module, the processes appear to be spawned but none of them start working.
The function and configuration for the multiprocess module are as follows:
import pandas as pd
my_file = pd.read_csv('my_file.txt')

from datetime import date, timedelta
# Build the last 17 dates, starting from today
start_dates = []
for i in range(17):
    today = date.today() - timedelta(days=i)
    start_dates.append(today)
start_dates = pd.DataFrame(start_dates, columns=['Date'])
start_dates = start_dates.merge(my_file)
start_dates = start_dates.values.tolist()

from multiprocess import Pool
pool = Pool(processes=8)

def my_func(value):
    import pandas as pd
    df = pd.read_csv('df.txt')
    df = df[df.Date > value]
    # lots of other computations
    return final_df

result = pool.map(my_func, [st for st in start_dates])
pool.close()
results = pd.concat(result)
The function and configuration for the multiprocessing module are as follows:
def my_func(value):
    import pandas as pd
    df = pd.read_csv('df.txt')
    df = df[df.Date > value]
    # lots of other computations
    return final_df

from multiprocessing import Pool
import pandas as pd
my_file = pd.read_csv('my_file.txt')

from datetime import date, timedelta
# Build the last 17 dates, starting from today
start_dates = []
for i in range(17):
    today = date.today() - timedelta(days=i)
    start_dates.append(today)
start_dates = pd.DataFrame(start_dates, columns=['Date'])
start_dates = start_dates.merge(my_file)
start_dates = start_dates.values.tolist()

def mp():
    pool = Pool(processes=8)
    result = pool.map(my_func, [st for st in start_dates])
    return result

if __name__ == '__main__':
    mp()
    results = pd.concat(result)
Is the problem related to the order in which I wrote the multiprocessing configuration?
Any help is appreciated.
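
For illustration, here is a minimal sketch of one way the multiprocessing version could be laid out. It is not a confirmed fix for the behaviour described above. It assumes the code is run as a script file rather than from an interactive session, because the standard library pickles the worker function by reference and the worker processes must be able to import it. The names build_start_dates and run_parallel are hypothetical, and the body of my_func is a stand-in: the pandas loading and the elided computations are replaced by a trivial date filter so the example is self-contained.

```python
from datetime import date, timedelta
from multiprocessing import Pool


def build_start_dates(days, today=None):
    """Return `days` consecutive dates going backwards from `today`."""
    today = today or date.today()
    return [today - timedelta(days=i) for i in range(days)]


def my_func(value):
    """Stand-in worker (assumption): keep the sample dates later than `value`.

    Defined at module top level so worker processes can import it by name.
    """
    sample = build_start_dates(5, today=date(2020, 5, 5))
    return [d for d in sample if d > value]


def run_parallel(values, processes=8):
    """Map my_func over values; the context manager shuts the pool down."""
    with Pool(processes=processes) as pool:
        return pool.map(my_func, values)


if __name__ == '__main__':
    # Only the parent process runs this block; workers that re-import
    # the module skip it, so they do not create pools of their own.
    start_dates = build_start_dates(17)
    # Assign the return value so the results are available afterwards.
    result = run_parallel(start_dates)
    print(len(result))
```

In the version above, the return value of mp() is never assigned, so result would be undefined when pd.concat is called. The sketch assigns it inside the guard for that reason.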
Source: https://stackoverflow.com/questions/61564223/multiprocess-module-vs-multiprocessing-module