Python pandas idxmax for multiple indexes in a dataframe

小蘑菇 2021-02-10 00:52

I have a series that looks like this:

            delivery
2007-04-26  706           23
2007-04-27  705           10
            706         1089
            708         


        
3 answers
  •  予麋鹿 (original poster)
    2021-02-10 01:03

    Your example code doesn't work because idxmax is executed after the groupby operation, so it runs on the whole DataFrame rather than on each group.

    I'm not sure how to use idxmax on multilevel indexes, so here's a simple workaround.

    Setting up the data:

    import pandas as pd

    # DeliveryCount is listed before DeliveryNb so that the column order
    # matches the printed output on Python 3.7+ (dicts keep insertion order).
    d = {'Date': ['2007-04-26', '2007-04-27', '2007-04-27', '2007-04-27',
                  '2007-04-27', '2007-04-28', '2007-04-28'],
         'DeliveryCount': [23, 10, 1089, 82, 34, 100, 11],
         'DeliveryNb': [706, 705, 708, 450, 283, 45, 89]}

    df = pd.DataFrame.from_dict(d, orient='columns').set_index('Date')
    print(df)
    

    Output:

                DeliveryCount  DeliveryNb
    Date                                 
    2007-04-26             23         706
    2007-04-27             10         705
    2007-04-27           1089         708
    2007-04-27             82         450
    2007-04-27             34         283
    2007-04-28            100          45
    2007-04-28             11          89
    

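    Creating a custom function:

    The trick is to call reset_index() inside each group. That replaces the repeated Date labels with integer positions 0..n-1, so idxmax() returns the position of the largest DeliveryCount within the group, which can be passed straight to iloc.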

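    def func(df):
        # Position (0..n-1) of the largest DeliveryCount within this group
        idx = df.reset_index()['DeliveryCount'].idxmax()
        # Look up the matching DeliveryNb by that position
        return df['DeliveryNb'].iloc[idx]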
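
    To see why reset_index() matters here, the sketch below takes the 2007-04-27 rows from the setup data as a single group and compares idxmax() with and without it. The variable names (group, label, position, picked) are only illustrative.

```python
import pandas as pd

# One group from the setup data: every row carries the same Date label.
group = pd.DataFrame(
    {'DeliveryNb': [705, 708, 450, 283],
     'DeliveryCount': [10, 1089, 82, 34]},
    index=pd.Index(['2007-04-27'] * 4, name='Date'),
)

# Without reset_index(), idxmax() returns the (duplicated) Date label,
# which does not identify a single row.
label = group['DeliveryCount'].idxmax()

# With reset_index(), the index is 0..n-1, so idxmax() returns a position.
position = group.reset_index()['DeliveryCount'].idxmax()
picked = group['DeliveryNb'].iloc[position]

print(label, position, picked)  # 2007-04-27 1 708
```

    Because the Date label is shared by all four rows, group.loc[label] would return the whole group again, which is why the function works with positions instead.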
    

    Applying it:

    g = df.groupby(df.index)
    g.apply(func)
    

    Result:

    Date
    2007-04-26    706
    2007-04-27    708
    2007-04-28     45
    dtype: int64
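
    As a possible alternative to the apply-based workaround, idxmax() can also be called per group on a Series that has a two-level index, as in the question. This is a minimal sketch and not necessarily what the question's code intended. It assumes the series is indexed by (Date, DeliveryNb) and reuses the setup data from this answer. The names s, best and result are illustrative.

```python
import pandas as pd

d = {'Date': ['2007-04-26', '2007-04-27', '2007-04-27', '2007-04-27',
              '2007-04-27', '2007-04-28', '2007-04-28'],
     'DeliveryCount': [23, 10, 1089, 82, 34, 100, 11],
     'DeliveryNb': [706, 705, 708, 450, 283, 45, 89]}

# Series of counts with a (Date, DeliveryNb) MultiIndex, like the question's.
s = pd.DataFrame(d).set_index(['Date', 'DeliveryNb'])['DeliveryCount']

# For each Date, idxmax() gives the full index label of the largest count,
# i.e. a (Date, DeliveryNb) tuple.
best = s.groupby(level='Date').idxmax()

# Keep only the DeliveryNb part of each tuple.
result = best.map(lambda label: label[1])
print(result)
```

    Each value in best is a (Date, DeliveryNb) tuple, so the final map just extracts the second element. On this data, result holds 706, 708 and 45, the same values as the apply-based version.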
    
