How do I read a large csv file with pandas?

隐瞒了意图╮ 2020-11-21 07:12

I am trying to read a large CSV file (approx. 6 GB) with pandas and I am getting a memory error:

MemoryError                               Traceback (most recen         


        
15 answers
  •  北荒 (original poster)
    2020-11-21 07:37

    For large data, I recommend using the "dask" library.
    For example:

    # Dataframes implement the Pandas API
    import dask.dataframe as dd
    df = dd.read_csv('s3://.../2018-*-*.csv')
    

    You can read more in the dask documentation.

    Another good alternative is modin: it aims to be a drop-in replacement for the pandas API, while running on distributed dataframe engines such as dask under the hood.
