How to read a large CSV into Python pandas using looping or parallel computing?

谎友^ 2021-02-13 21:46

I have an 8 GB CSV file and I am not able to run the code because it raises a memory error.

import pandas as pd

file = "./data.csv"
df = pd.read_csv(file, sep="/", header=0, dtype=str)
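The looping approach mentioned in the title can be done with the `chunksize` parameter of `pd.read_csv`, which yields the file in fixed-size pieces so only one piece is in memory at a time. A minimal sketch, using a small in-memory CSV with hypothetical `id`/`value` columns in place of the real 8 GB file (which would be passed by path):

```python
import io
import pandas as pd

# Hypothetical small CSV standing in for the 8 GB file; it mirrors the
# sep="/" delimiter from the question.
csv_text = "id/value\n1/10\n2/20\n3/30\n4/40\n5/50\n"

total = 0
rows = 0
# chunksize makes read_csv yield DataFrames of at most N rows each,
# so only one chunk is held in memory at a time.
for chunk in pd.read_csv(io.StringIO(csv_text), sep="/", header=0, chunksize=2):
    total += chunk["value"].sum()
    rows += len(chunk)

print(rows, total)  # 5 rows, total 150
```

Aggregations that can be accumulated chunk by chunk (sums, counts, filters written out incrementally) work well this way; operations that need the whole table at once do not.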


        
5 Answers
  •  甜味超标
    2021-02-13 22:47

    You might also want to use the Dask framework and its built-in dask.dataframe. Essentially, the CSV file is split into multiple pandas DataFrames, each read in only when necessary. However, not every pandas operation is available in Dask.
