Pandas get topmost n records within each group

无人共我 2020-11-22 06:07

Suppose I have a pandas DataFrame like this:

>>> df = pd.DataFrame({'id':[1,1,1,2,2,2,2,3,4],'value':[1,2,3,1,2,3,4,1,1]})
>>> df
   id           


        
3 Answers
  •  隐瞒了意图╮
    2020-11-22 06:47

    Sorting the whole DataFrame up front can be very time-consuming. Instead, we can group first and then take the top k rows within each group:

    topk = 2  # example value: number of rows to keep per group
    g = df.groupby(['id']).apply(lambda x: x.nlargest(topk, ['value'])).reset_index(drop=True)
    
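As a rough, self-contained sketch of the same idea, the snippet below takes the top k values per group without sorting the whole frame first. It uses the question's DataFrame; the value topk = 2 and the variable name top_per_group are assumptions made for illustration. It calls nlargest on the grouped column instead of going through apply, since how apply treats the grouping column differs between pandas versions.

```python
import pandas as pd

# DataFrame from the question
df = pd.DataFrame({'id': [1, 1, 1, 2, 2, 2, 2, 3, 4],
                   'value': [1, 2, 3, 1, 2, 3, 4, 1, 1]})

topk = 2  # assumed for illustration: rows to keep per group

# nlargest on the grouped column returns a Series indexed by
# (id, original row label); drop the row label and turn id back into a column.
top_per_group = (
    df.groupby('id')['value']
      .nlargest(topk)
      .reset_index(level=1, drop=True)
      .reset_index()
)

print(top_per_group)
```

Groups with fewer than topk rows (here id 3 and id 4) simply return all the rows they have.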
