How to load large data into Python pandas using looping or parallel computing?

谎友^ 2021-02-13 21:46

I have an 8 GB CSV file, and I cannot run the code below because it raises a memory error.

import pandas as pd

file = "./data.csv"
df = pd.read_csv(file, sep="/", header=0, dtype=str)


        
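As a side note on the "looping" idea in the title: pandas can read a file in pieces via the `chunksize` parameter of `read_csv`, so only one chunk sits in memory at a time. The sketch below is illustrative only, using a small in-memory CSV (its contents and the even-row filter are made up) in place of the 8 GB file:

```python
import io
import pandas as pd

# Stand-in for the 8 GB file in the question: a small in-memory CSV
# with the same "/"-separated format (contents invented for illustration).
csv_data = "a/b\n" + "\n".join(f"{i}/{i * 2}" for i in range(10))

# chunksize turns read_csv into an iterator of smaller DataFrames,
# so the whole file is never loaded at once.
pieces = []
with pd.read_csv(io.StringIO(csv_data), sep="/", header=0,
                 dtype=str, chunksize=4) as reader:
    for chunk in reader:
        # Process each chunk here (filter, aggregate, ...); keeping only
        # the reduced result is what bounds memory usage.
        pieces.append(chunk[chunk["a"].astype(int) % 2 == 0])

result = pd.concat(pieces, ignore_index=True)
print(len(result))  # rows where column "a" is even
```

With the real file, `io.StringIO(csv_data)` would be replaced by the path `"./data.csv"`, and the filter by whatever per-chunk processing reduces the data.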
5 Answers
  •  忘掉有多难
    2021-02-13 22:33

    If you don't need all the columns, you can also use the `usecols` parameter:

    https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html

    usecols : array-like or callable, default None
    
    Return a subset of the columns. [...] 
    Using this parameter results in much faster parsing time and lower memory usage.
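    A minimal sketch of combining `usecols` with the question's read pattern; the column names ("id", "name", "value") and the in-memory CSV are invented for illustration:

    ```python
    import io
    import pandas as pd

    # Stand-in for the question's "/"-separated CSV.
    csv_data = "id/name/value\n1/foo/10\n2/bar/20"

    # Only the listed columns are parsed, which lowers memory usage and
    # parsing time when the file has columns you don't need.
    df = pd.read_csv(io.StringIO(csv_data), sep="/", header=0, dtype=str,
                     usecols=["id", "value"])
    print(list(df.columns))  # ['id', 'value']
    ```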
    
