This is my first time asking a question.
I'm working with a large CSV dataset (it contains over 15 million rows and is over 1.5 GB in size).
I'm loading th
filter is generally known to be slow when used with GroupBy. If you are trying to filter a DataFrame based on a conditional inside a GroupBy, a better alternative is to use transform or map:
df[df.groupby('mac')['latency'].transform('count').gt(1)]
df[df['mac'].map(df.groupby('mac')['latency'].count()).gt(1)]
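To check that both alternatives keep the same rows as a plain GroupBy filter, here is a minimal sketch on a tiny made-up frame. The column names mac and latency come from the snippets above; the sample values and the variable names (slow, via_transform, via_map) are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the post.
df = pd.DataFrame({
    "mac": ["aa", "aa", "bb", "cc", "cc", "cc"],
    "latency": [10, 12, 7, 3, 4, 5],
})

# Baseline: GroupBy.filter calls the Python lambda once per group.
slow = df.groupby("mac").filter(lambda g: len(g) > 1)

# transform broadcasts each group's count back onto the original rows,
# producing a boolean mask aligned with df's index.
via_transform = df[df.groupby("mac")["latency"].transform("count").gt(1)]

# map looks up each row's mac in the per-group count Series.
via_map = df[df["mac"].map(df.groupby("mac")["latency"].count()).gt(1)]

print(via_transform)
```

Note that count ignores missing values in latency, so if that column can contain NaN, transform('size') is probably closer to what len(g) > 1 does in the filter version.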