Pandas groupby().agg() - how to return results without the multi-index?

Submitted by 天大地大妈咪最大 on 2019-12-18 03:17:52

Question


I have a dataframe:

pe_odds[ [ 'EVENT_ID', 'SELECTION_ID', 'ODDS' ] ]
Out[67]: 
     EVENT_ID  SELECTION_ID   ODDS
0   100429300       5297529  18.00
1   100429300       5297529  20.00
2   100429300       5297529  21.00
3   100429300       5297529  22.00
4   100429300       5297529  23.00
5   100429300       5297529  24.00
6   100429300       5297529  25.00

When I use groupby and agg, I get results with a multi-index:

pe_odds.groupby( [ 'EVENT_ID', 'SELECTION_ID' ] )[ 'ODDS' ].agg( [ np.min, np.max ] )
Out[68]: 
                         amin   amax
EVENT_ID  SELECTION_ID              
100428417 5490293        1.71   1.71
          5881623        1.14   1.35
          5922296        2.00   2.00
          5956692        2.00   2.02
100428419 603721         2.44   2.90
          4387436        4.30   6.20
          4398859        1.23   1.35
          4574687        1.35   1.46
          4881396       14.50  19.00
          6032606        2.94   4.20
          6065580        2.70   5.80
          6065582        2.42   3.65
100428421 5911426        2.22   2.52

I have tried using as_index=False to return the results without the multi-index:

pe_odds.groupby( [ 'EVENT_ID', 'SELECTION_ID' ], as_index=False )[ 'ODDS' ].agg( [ np.min, np.max ], as_index=False )

But it still gives me a multi-index.

I can use .reset_index(), but it is very slow:

pe_odds.groupby( [ 'EVENT_ID', 'SELECTION_ID' ] )[ 'ODDS' ].agg( [ np.min, np.max ] ).reset_index()
Out[69]: 
     EVENT_ID  SELECTION_ID   amin   amax
0   100428417       5490293   1.71   1.71
1   100428417       5881623   1.14   1.35
2   100428417       5922296   2.00   2.00
3   100428417       5956692   2.00   2.02
4   100428419        603721   2.44   2.90
5   100428419       4387436   4.30   6.20

How can I return the results without the multi-index using only parameters of the groupby and/or agg functions, without having to resort to reset_index()?


Answer 1:


The call below:

>>> gr = df.groupby(['EVENT_ID', 'SELECTION_ID'], as_index=False)
>>> res = gr.agg({'ODDS':[np.min, np.max]})
>>> res
    EVENT_ID SELECTION_ID ODDS     
                          amin amax
0  100429300      5297529   18   25
1  100429300      5297559   30   38

returns a frame with multi-index columns. If you do not want the columns to be a multi-index either, you can flatten them:

>>> res.columns = list(map(''.join, res.columns.values))
>>> res
    EVENT_ID  SELECTION_ID  ODDSamin  ODDSamax
0  100429300       5297529        18        25
1  100429300       5297559        30        38
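
As an alternative to flattening the columns afterwards, here is a minimal sketch using named aggregation together with as_index=False. It assumes pandas 0.25 or later, which postdates the original question. The output names odds_min and odds_max and the sample values are illustrative, not taken from the original post.

```python
import pandas as pd

# Small sample frame modelled on the question's columns (values are illustrative).
pe_odds = pd.DataFrame({
    "EVENT_ID": [100429300, 100429300, 100429300, 100429300],
    "SELECTION_ID": [5297529, 5297529, 5297559, 5297559],
    "ODDS": [18.0, 25.0, 30.0, 38.0],
})

# Named aggregation: each keyword is an output column name and each value
# is a (source column, aggregation) pair. as_index=False keeps the group
# keys as ordinary columns, so no reset_index() call is needed.
res = pe_odds.groupby(["EVENT_ID", "SELECTION_ID"], as_index=False).agg(
    odds_min=("ODDS", "min"),
    odds_max=("ODDS", "max"),
)

print(res)
```

With this form the result should have a plain default row index and a single level of column names, so neither the rows nor the columns need further flattening.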


Source: https://stackoverflow.com/questions/26323926/pandas-groupby-agg-how-to-return-results-without-the-multi-index
