Drop rows corresponding to groups smaller than specified size

滥情空心 2021-01-22 20:20

I have a DataFrame of answers for 100 questions_id and 50 user_id's. Each row represents a single question from a specific user. The ta

2 Answers
  •  [愿得一人]
    2021-01-22 20:41

    Use groupby and filter; it is succinct and intended for exactly this purpose.

    df1 = df.groupby('user_id').filter(lambda x: len(x) > 100)
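
    To see what filter keeps, here is a minimal sketch on made-up data. The toy frame, the min_size threshold of 2, and the names kept and kept_mask are illustrative assumptions, not values from the question.

```python
import pandas as pd

# Hypothetical toy data: user 1 has 3 rows, user 2 has 1 row, user 3 has 2 rows.
df = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 3, 3],
    "question_id": [10, 11, 12, 10, 10, 11],
})

min_size = 2  # illustrative threshold, not the value from the question

# filter() calls the lambda once per group and keeps every row of the
# groups for which it returns True; original row order is preserved.
kept = df.groupby("user_id").filter(lambda g: len(g) >= min_size)

# The same selection written as a boolean mask over per-row group sizes.
sizes = df.groupby("user_id")["user_id"].transform("size")
kept_mask = df[sizes >= min_size]
```

    Both kept and kept_mask drop the single row of user 2 and leave the other rows with their original index labels.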
    

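    For better performance on large frames, avoid the per-group Python lambda: count the rows per user with np.unique and map the counts back onto the column: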

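    import numpy as np
    # m maps each user_id to its row count; keep rows whose user has more than 100.
    m = dict(zip(*np.unique(df.user_id, return_counts=True)))
    df1 = df[df['user_id'].map(m) > 100]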
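
    A self-contained sketch of the count-and-map approach, again on made-up data. The frame, min_size, and the variable names are assumptions for illustration; the value_counts variant is an alternative that was not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: user 1 has 3 rows, user 2 has 1 row, user 3 has 2 rows.
df = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 3, 3],
    "question_id": [10, 11, 12, 10, 10, 11],
})

min_size = 2  # illustrative threshold, not the value from the question

# np.unique returns the distinct user ids and how often each occurs.
values, counts = np.unique(df["user_id"].to_numpy(), return_counts=True)
size_by_user = dict(zip(values, counts))

# Map every row to the size of its user's group, then filter with a mask.
result = df[df["user_id"].map(size_by_user) >= min_size]

# pandas-only variant: value_counts() yields the same per-user counts.
result_vc = df[df["user_id"].map(df["user_id"].value_counts()) >= min_size]
```

    Either variant builds one vectorized mask instead of calling a Python function for every group.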
    
