How to share pandas DataFrame object between processes?

谎友^ 2021-01-02 15:31

This question makes the same point as the link I posted before:

(Is there a good way to avoid memory deep copy or to reduce time spent in multiprocessing?)

1 Answer
  • 2021-01-02 15:53

    You can use a Manager Namespace; the following code works as you'd expect.

    # -*- coding: UTF-8 -*-
    import numpy as np
    import pandas as pd
    from multiprocessing import Process, Manager
    
    def add_new_derived_column(ns):
        # Reading ns.df transfers a pickled copy of the DataFrame
        # from the manager process into this worker.
        dataframe2 = ns.df
        dataframe2['new_column'] = dataframe2['A'] + dataframe2['B'] / 2
        print(dataframe2.head())
        # Assigning back is required: mutating the local copy does not
        # propagate to the manager; only reassignment does.
        ns.df = dataframe2
    
    if __name__ == "__main__":
    
        mgr = Manager()
        ns = mgr.Namespace()
    
        dataframe = pd.DataFrame(np.random.randn(100000, 2), columns=['A', 'B'])
        ns.df = dataframe
        print(dataframe.head())
    
        # Pass the shared Namespace to the multiprocessing.Process object.
        process = Process(target=add_new_derived_column, args=(ns,))
        process.start()
        process.join()
    
        # The parent now sees the new column via the manager.
        print(ns.df.head())
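
    Note that a Manager Namespace does not avoid copying: every read of ns.df
    pickles the whole DataFrame and ships it between the manager process and
    the caller. If the goal is to avoid the deep copy itself, a minimal sketch
    along the following lines, using multiprocessing.shared_memory (Python
    3.8+), backs the DataFrame's numeric data with one shared buffer. The
    worker function and column names here are illustrative, not part of the
    original answer.

    import numpy as np
    import pandas as pd
    from multiprocessing import Process
    from multiprocessing.shared_memory import SharedMemory

    def worker(name, shape, dtype):
        shm = SharedMemory(name=name)  # attach to the existing shared block
        arr = np.ndarray(shape, dtype=dtype, buffer=shm.buf)
        # copy=False asks pandas to wrap the shared buffer instead of copying it
        df = pd.DataFrame(arr, columns=['A', 'B'], copy=False)
        print(df.head())
        shm.close()

    if __name__ == "__main__":
        src = np.random.randn(100000, 2)
        shm = SharedMemory(create=True, size=src.nbytes)  # allocate shared block
        arr = np.ndarray(src.shape, dtype=src.dtype, buffer=shm.buf)
        arr[:] = src  # one copy into shared memory; children then read in place
        p = Process(target=worker, args=(shm.name, src.shape, src.dtype))
        p.start()
        p.join()
        shm.close()
        shm.unlink()  # release the shared block when done

    The trade-off: shared memory gives in-place reads but no coordination, so
    concurrent writers need their own locking, whereas the Namespace proxy
    serializes access for you at the cost of a full copy per transfer.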
    