pandas groupby: TOP 3 values for each group

说谎 2020-12-20 07:23

A newer, more generic version of this question has been posted as "pandas groupby: TOP 3 values in each group and store in DataFrame", and it has a working answer there.

1 Answer
  • 2020-12-20 07:31

    NOTE: This solution works only if each group has at least 3 rows

    Try the following approach:

    In [59]: x = (df.groupby(pd.Grouper(freq='H'))['VAL']
                    .apply(lambda x: x.nlargest(3))
                    .reset_index(level=1, drop=True)
                    .to_frame('VAL'))
    
    In [60]: x
    Out[60]:
                         VAL
    TIME
    2017-12-08 00:00:00   82
    2017-12-08 00:00:00   56
    2017-12-08 00:00:00   53
    2017-12-08 01:00:00   95
    2017-12-08 01:00:00   87
    2017-12-08 01:00:00   79
    2017-12-08 02:00:00   88
    2017-12-08 02:00:00   78
    2017-12-08 02:00:00   41
    
    In [61]: x.set_index(np.arange(len(x)) % 3, append=True)['VAL'].unstack().add_prefix('VAL')
    Out[61]:
                         VAL0  VAL1  VAL2
    TIME
    2017-12-08 00:00:00    82    56    53
    2017-12-08 01:00:00    95    87    79
    2017-12-08 02:00:00    88    78    41
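
The session above does not show how `df` was built. As a minimal sketch, assuming a frame with a `DatetimeIndex` named `TIME` and an integer column `VAL` (the sample values below are made up for illustration, and the lowercase `'h'` alias is used because recent pandas versions warn about `'H'`), the same two steps can be run end to end like this:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: four readings per hour over three hours.
idx = pd.date_range("2017-12-08", periods=12, freq="15min", name="TIME")
df = pd.DataFrame(
    {"VAL": [82, 56, 53, 10, 95, 87, 79, 5, 88, 78, 41, 3]}, index=idx
)

# Step 1: keep the 3 largest values of each hourly bin.
# apply() prepends the bin label as an extra index level, so the original
# timestamp level (level=1) is dropped afterwards.
x = (df.groupby(pd.Grouper(freq="h"))["VAL"]
       .apply(lambda s: s.nlargest(3))
       .reset_index(level=1, drop=True)
       .to_frame("VAL"))

# Step 2: number the rows 0, 1, 2 within each hour and pivot them to columns.
# This relies on every hour contributing exactly 3 rows.
wide = (x.set_index(np.arange(len(x)) % 3, append=True)["VAL"]
          .unstack()
          .add_prefix("VAL"))

print(wide)
```

Each hour here has four rows, so `nlargest(3)` always returns a strict subset of the group and the position counter `np.arange(len(x)) % 3` stays aligned with the hour boundaries.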
    

    Some explanation:

    In [94]: x.set_index(np.arange(len(x)) % 3, append=True)
    Out[94]:
                           VAL
    TIME
    2017-12-08 00:00:00 0   82
                        1   56
                        2   53
    2017-12-08 01:00:00 0   95
                        1   87
                        2   79
    2017-12-08 02:00:00 0   88
                        1   78
                        2   41
    
    In [95]: x.set_index(np.arange(len(x)) % 3, append=True)['VAL'].unstack()
    Out[95]:
                          0   1   2
    TIME
    2017-12-08 00:00:00  82  56  53
    2017-12-08 01:00:00  95  87  79
    2017-12-08 02:00:00  88  78  41
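
The `% 3` trick is what ties this solution to groups of at least 3 rows: a shorter group shifts the counter for every group after it. One possible variation, not part of the original answer, is to number the rows per group with `cumcount()` instead, which leaves `NaN` in the missing slots. This is a sketch under the same assumptions as above (a `DatetimeIndex` named `TIME`, a column `VAL`); the sample data and the helper name `top_n_per_hour` are hypothetical:

```python
import pandas as pd

# Hypothetical sample data; the 02:00 hour has only two rows.
idx = pd.to_datetime([
    "2017-12-08 00:05", "2017-12-08 00:20", "2017-12-08 00:40", "2017-12-08 00:55",
    "2017-12-08 01:10", "2017-12-08 01:30", "2017-12-08 01:50",
    "2017-12-08 02:15", "2017-12-08 02:45",
])
df = pd.DataFrame({"VAL": [82, 56, 53, 10, 95, 87, 79, 88, 78]}, index=idx)
df.index.name = "TIME"


def top_n_per_hour(frame, n=3):
    """Return the n largest VAL entries per hour as columns VAL0..VAL{n-1}."""
    # Sort descending so that, within each hour, rows appear largest first.
    s = frame["VAL"].sort_values(ascending=False)
    hour = s.index.floor("h")
    # Position of each value within its hour: 0 for the largest, 1 for the next, ...
    rank = s.groupby(hour).cumcount()
    top = pd.DataFrame(
        {"TIME": hour, "rank": rank.to_numpy(), "VAL": s.to_numpy()}
    )
    top = top[top["rank"] < n]
    # One row per hour, one column per rank; short groups get NaN.
    wide = top.pivot(index="TIME", columns="rank", values="VAL").add_prefix("VAL")
    wide.columns.name = None
    return wide


wide = top_n_per_hour(df)
print(wide)
```

Because a missing slot is stored as `NaN`, the resulting columns are floats rather than integers whenever any hour has fewer than `n` rows.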
    