Memory-efficient filtering of `DataFrame` rows


时光取名叫无心

I have a large DataFrame object (1,440,000,000 rows). I am operating at the memory limit (swap included).

I need to extract a subset of the rows with certain value o

2 Answers

醉梦人生

2021-01-23 00:57
If by any chance all the data in the DataFrame are of the same type, use a NumPy array instead; it is more memory efficient and faster. You can convert your DataFrame to a NumPy array with df.to_numpy() (df.as_matrix() in older pandas; it was removed in pandas 1.0).
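As a rough sketch of that suggestion, assuming every column shares one numeric dtype (the column names and values below are made up for illustration and are not from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical homogeneous frame; names and values are illustrative only.
df = pd.DataFrame({"field": [1, 2, 1, 3], "other": [10, 20, 30, 40]}, dtype=np.int64)

arr = df.to_numpy()                      # 2-D ndarray with a single dtype
field_idx = df.columns.get_loc("field")  # position of the column to filter on
mask = arr[:, field_idx] == 1            # boolean mask, one entry per row
subset = arr[mask]                       # rows where field == 1 (this is a copy)
```

Note that boolean-mask indexing copies the selected rows, so the subset still needs its own memory on top of the original array.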

Also, you might want to check how much memory the DataFrame already takes:
```
import sys
sys.getsizeof(df)  # df is your DataFrame
```
which returns the size in bytes.
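To see where the memory goes, pandas can also report usage per column. A minimal sketch with a made-up frame (the column names are assumptions, not from the question):

```python
import pandas as pd

# Hypothetical frame with one numeric and one string column.
df = pd.DataFrame({"field": [1, 2, 1, 3], "name": ["a", "b", "c", "d"]})

# deep=True also counts the Python objects behind object-dtype columns.
per_column = df.memory_usage(deep=True)  # Series of byte counts, including the index
total_bytes = int(per_column.sum())

print(per_column)
print(total_bytes)
```

Columns that turn out to be unexpectedly large (often object-dtype strings) are usually the first candidates for a cheaper dtype.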
一个人的身影

2021-01-23 01:06
Use query; it should be a bit faster:
```
# "field" is the column name; prefix a Python variable with @, or write a literal instead
df = df.query("field == @value")
```
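For comparison, here is a small sketch of query next to the equivalent boolean mask. The frame and the `value` variable are hypothetical; inside the query string, `@value` refers to the local Python variable, whereas a bare name would be looked up as a column.

```python
import pandas as pd

# Hypothetical data; "field" and "other" are illustrative column names.
df = pd.DataFrame({"field": [1, 2, 1, 3], "other": [10, 20, 30, 40]})
value = 1  # hypothetical value to filter on

by_query = df.query("field == @value")  # @ pulls in the local variable
by_mask = df[df["field"] == value]      # same rows via a boolean mask
```

Both forms return a new DataFrame holding the matching rows, so neither avoids allocating memory for the result.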