Memory-efficient filtering of `DataFrame` rows

时光取名叫无心 2021-01-23 00:22

I have a large DataFrame object (1,440,000,000 rows). I am operating at the memory limit (swap included).

I need to extract a subset of the rows with a certain value o
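The kind of row extraction described here is usually done with a boolean mask. A minimal sketch (the column name `col` and the target value are assumptions, not from the question); note that mask indexing copies the matching rows, so peak memory grows by the size of the subset:

```python
import pandas as pd

# Tiny illustrative frame; the real one is far larger.
df = pd.DataFrame({"col": [1, 2, 1, 3], "other": [10, 20, 30, 40]})

# Boolean-mask selection: keeps only rows where 'col' equals 1.
# This allocates a new DataFrame holding a copy of those rows.
subset = df[df["col"] == 1]
print(len(subset))  # 2
```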

2 Answers
  •  醉梦人生
    2021-01-23 00:57

    If by any chance all the data in the DataFrame are of the same type, use a NumPy array instead; it's more memory-efficient and faster. You can convert your DataFrame to a NumPy array with df.to_numpy() (df.as_matrix() in pandas versions before 1.0, where it was removed).
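    A sketch of that suggestion, using `df.to_numpy()` (the current pandas API; `as_matrix()` was removed in pandas 1.0). The column names and values are made up for illustration:

    ```python
    import numpy as np
    import pandas as pd

    # Small illustrative frame with homogeneous (float64) columns.
    df = pd.DataFrame({"a": [1.0, 2.0, 1.0], "b": [4.0, 5.0, 6.0]})

    # With a single shared dtype, to_numpy() yields one homogeneous array
    # instead of per-column pandas structures.
    arr = df.to_numpy()
    print(arr.dtype)  # float64

    # Row filtering then works directly on the array with a boolean mask,
    # e.g. rows whose first column equals 1.0.
    rows = arr[arr[:, 0] == 1.0]
    print(rows.shape)  # (2, 2)
    ```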

    You might also want to check how much memory the DataFrame already takes:

        import sys
        sys.getsizeof(df)
    

    which returns the size in bytes.
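    Worth knowing: `sys.getsizeof()` can under-count a DataFrame whose buffers or object (string) columns live outside the object it inspects; pandas' own `DataFrame.memory_usage(deep=True)` follows those references. A small comparison (the example data is made up):

    ```python
    import sys

    import pandas as pd

    # Frame mixing a numeric column and an object (string) column.
    df = pd.DataFrame({"x": range(1000), "s": ["abc"] * 1000})

    # Size of the DataFrame object as Python reports it.
    print(sys.getsizeof(df))

    # Per-column byte counts; deep=True also measures the Python
    # string objects referenced by the object column.
    print(df.memory_usage(deep=True).sum())
    ```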
