How do I improve the performance of pandas GroupBy filter operation?

醉话见心 2021-01-13 00:50

This is my first time asking a question.

I'm working with a large CSV dataset (over 15 million rows and more than 1.5 GB in size).

I'm loading th

1 Answer

伪装坚强ぢ (original poster)

2021-01-13 01:07
filter is generally slow when used with GroupBy, because it calls a Python function once per group. If you are trying to filter a DataFrame based on a per-group condition, a faster alternative is to compute the condition with transform or map and use it as a boolean mask:
```
df[df.groupby('mac')['latency'].transform('count').gt(1)]
```
```
df[df['mac'].map(df.groupby('mac')['latency'].count()).gt(1)]
```
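To check that the two one-liners above select the same rows as a plain `filter`, here is a minimal sketch on a tiny made-up DataFrame. The sample values are hypothetical. Only the column names `mac` and `latency` come from the answer, and the `filter` lambda is an assumed version of the original slow code, which is not shown in the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the answer above.
df = pd.DataFrame({
    "mac": ["a", "a", "b", "c", "c", "c"],
    "latency": [10, 12, 7, 3, 4, 5],
})

# Assumed baseline: GroupBy.filter calls the lambda once per group,
# which gets slow when there are many groups.
slow = df.groupby("mac").filter(lambda g: len(g) > 1)

# transform("count") broadcasts each group's count back onto the original rows,
# so the result can be used directly as a boolean mask.
via_transform = df[df.groupby("mac")["latency"].transform("count").gt(1)]

# map looks up each row's group count in a per-group Series.
counts = df.groupby("mac")["latency"].count()
via_map = df[df["mac"].map(counts).gt(1)]

print(via_transform)
```

Note that `count` ignores NaN values in `latency`. If every row should count regardless of missing values, `transform("size")` is the closer match to `len(g)`.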