Passing a pandas DataFrame into a Python subprocess.Popen as an argument

走了就别回头了 2021-01-19 19:22

I am attempting to call a Python script from a master script. I need the DataFrame to be generated only once from within the master script and then passed on to the subprocess

2 Answers
  • 2021-01-19 19:55

    A subprocess launches another application. Processes communicate with each other very differently from functions inside a single Python program: the DataFrame has to cross a non-Python boundary, so you need to serialize it to bytes or text on one end and deserialize it on the other. For example, you can use the pickle module: call sp.communicate(pickle.dumps(test_dataframe)) on one end and pickle.loads(sys.stdin.buffer.read()) on the other. In Python 3, pickle data is bytes, so read from sys.stdin.buffer rather than sys.stdin. Alternatively, you can write the DataFrame as CSV and parse it again, or use any other format.
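
    As a minimal sketch of the pickle route with Popen.communicate: the child program is inlined with -c so that everything fits in one file, and a plain dict of columns stands in for the DataFrame so the sketch runs without pandas. The names CHILD_CODE, round_trip and table are hypothetical, chosen for illustration only. A DataFrame is picklable and should pass through the same way, provided pandas is importable in the child.

```python
import pickle
import subprocess
import sys

# Hypothetical child program, passed inline with -c so the sketch is one file.
# It unpickles the bytes on stdin and pickles the same object back to stdout.
CHILD_CODE = (
    "import pickle, sys\n"
    "obj = pickle.loads(sys.stdin.buffer.read())\n"
    "sys.stdout.buffer.write(pickle.dumps(obj))\n"
)


def round_trip(obj):
    """Send obj to a child Python process as pickle bytes and read it back."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    # communicate() writes the bytes, closes stdin, and waits for the child.
    out, _ = proc.communicate(pickle.dumps(obj))
    if proc.returncode != 0:
        raise RuntimeError(f"child exited with code {proc.returncode}")
    return pickle.loads(out)


# Stand-in for a DataFrame: a dict of columns (an assumption of this sketch).
table = {"a": [1, 2, 3], "b": ["x", "y", "z"]}
echoed = round_trip(table)
```

    Only unpickle data from a process you trust, since pickle.loads can execute arbitrary code.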

  • 2021-01-19 20:09

    Here is a complete example for Python 3.6 of two-way communication between the master script and a subprocess.

    master.py

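    import pandas as pd
    import pickle
    import subprocess
    
    # Recent pandas versions use sheet_name; the older sheetname keyword was removed
    df = pd.read_excel(r'C:\test_location\file.xlsx', sheet_name='Table')
    
    # Send the pickled DataFrame on stdin; capture stdout and stderr as bytes
    result = subprocess.run(['python', 'call_model.py'], input=pickle.dumps(df), stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    returned_df = pickle.loads(result.stdout)
    # df == returned_df is element-wise and cannot be asserted directly
    assert df.equals(returned_df)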
    

    If there is a problem, you can check result.stderr.
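
    A rough sketch of that check, using a deliberately failing child so there is something to inspect. The inline FAILING_CHILD program and the variable names are hypothetical and not part of the answer's scripts.

```python
import subprocess
import sys

# Hypothetical failing child, inlined with -c: it writes a message to stderr
# and exits with a non-zero status.
FAILING_CHILD = "import sys; sys.stderr.write('boom'); sys.exit(3)"

result = subprocess.run(
    [sys.executable, "-c", FAILING_CHILD],
    input=b"",
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)

error_text = ""
if result.returncode != 0:
    # stderr is bytes because no text mode was requested; decode for display.
    error_text = result.stderr.decode(errors="replace")
    print(f"child failed with code {result.returncode}: {error_text}")
```

    Checking result.returncode before calling pickle.loads(result.stdout) avoids trying to unpickle empty or partial output from a crashed child.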

    call_model.py

    import pickle
    import sys
    
    data = pickle.loads(sys.stdin.buffer.read())
    sys.stdout.buffer.write(pickle.dumps(data))
    