pandas multiprocessing apply

I'm trying to use multiprocessing with a pandas DataFrame: split the DataFrame into 8 parts, then apply some function to each part using apply (with each part processed in d

8 Answers

迷失自我

2020-11-28 06:37
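Install Pyxtension, which simplifies parallel map, and use it like this (here `process` is your own function that takes one chunk of the DataFrame and returns the processed chunk):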
```
import multiprocessing
import numpy as np
import pandas as pd
from pyxtension.streams import stream

big_df = pd.concat(stream(np.array_split(df, multiprocessing.cpu_count())).mpmap(process))
```
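If you would rather not add a dependency, the same split / apply / concat pattern can be sketched with the standard library's `multiprocessing.Pool`. This is only a rough sketch: `process`, `parallel_apply`, and the columns `a`, `b`, `total` are hypothetical names made up for illustration, and `processes=1` is a serial fallback I added for debugging, not something from the question.

```python
import multiprocessing

import numpy as np
import pandas as pd


def process(chunk: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical per-chunk work: add a column computed row by row with apply.
    out = chunk.copy()
    out["total"] = out.apply(lambda row: row["a"] + row["b"], axis=1)
    return out


def parallel_apply(df, func, n_parts=8, processes=None):
    # Split the row positions into n_parts nearly equal groups, then slice with iloc.
    positions = np.array_split(np.arange(len(df)), n_parts)
    chunks = [df.iloc[idx] for idx in positions]
    if processes == 1:
        # Serial fallback (an assumption of this sketch), handy for debugging.
        results = [func(chunk) for chunk in chunks]
    else:
        # func must be defined at module level so it can be pickled for the workers.
        with multiprocessing.Pool(processes) as pool:
            results = pool.map(func, chunks)
    # Pool.map returns results in input order, so concat restores the row order.
    return pd.concat(results)
```

Call it as `parallel_apply(df, process, n_parts=8)`, and put that call under `if __name__ == "__main__":` so it also works on platforms that start worker processes with spawn (Windows, macOS).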

别跟我提以往

2020-11-28 06:38

This worked well for me:

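```
import multiprocessing

# Lazily yield each row as a Series; process_apply receives one row at a time.
rows_iter = (row for _, row in df.iterrows())

with multiprocessing.Pool() as pool:
    df['new_column'] = pool.map(process_apply, rows_iter)
```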
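One thing to watch with the row-wise approach is that every row is pickled and sent to a worker, and a `Series` per row is fairly heavy. A possible variation, sketched below, sends plain tuples from `itertuples(index=False, name=None)` and batches them with the `chunksize` argument of `Pool.map`. The names `process_row`, `add_column`, and the columns used here are hypothetical, and the `processes=1` serial path is my own addition for debugging.

```python
import multiprocessing

import pandas as pd


def process_row(row):
    # Hypothetical per-row work: row is a plain tuple of the column values (a, b).
    a, b = row
    return a * b


def add_column(df, func, column, processes=None, chunksize=64):
    # Plain tuples are cheaper to pickle than one Series per row.
    rows = df.itertuples(index=False, name=None)
    if processes == 1:
        # Serial fallback (an assumption of this sketch), no worker processes.
        values = [func(row) for row in rows]
    else:
        with multiprocessing.Pool(processes) as pool:
            # chunksize batches rows per task to cut inter-process overhead.
            values = pool.map(func, rows, chunksize=chunksize)
    result = df.copy()
    result[column] = values  # assigned positionally, one value per row
    return result
```

For example, `add_column(df, process_row, "new_column")` returns a copy of `df` with the extra column. Whether this beats a plain `df.apply` depends on how expensive the per-row function is, so it is worth timing both on your data.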
