How to get value of a column based on the maximum of another column in case of DataFrame.groupby

Submitted by 戏子无情 on 2021-01-03 04:27:26

Question


I have a dataframe that looks like this:

id   YearReleased  Artist          count
168  2015          Muse            1
169  2015          Rihanna         3
170  2015          Taylor Swift    2
171  2016          Jennifer Lopez  1
172  2016          Rihanna         3
173  2016          Underworld      1
174  2017          Coldplay        1
175  2017          Ed Sheeran      2

I want to get the maximum count for each year and then get the corresponding Artist name.

Something like this:

YearReleased  Artist
2015          Rihanna
2016          Rihanna
2017          Ed Sheeran

I tried looping over the rows of the dataframe and building a dictionary with the year as key and the artist as value. But when I convert that dictionary to a dataframe, the keys become columns instead of rows.
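
For the dictionary route described above, one possible way to get the keys as rows is to go through a Series first, since a Series built from a dict uses the keys as its index. This is only a sketch: the dictionary `top_artist` is a hypothetical stand-in for whatever the loop produced, and the names `top_artist` and `result` are not from the original post.

```python
import pandas as pd

# Hypothetical dictionary of the kind the loop would build: year -> artist.
top_artist = {2015: "Rihanna", 2016: "Rihanna", 2017: "Ed Sheeran"}

# A Series built from a dict takes the keys as its index, so each key
# ends up as a row. rename_axis names that index, and reset_index turns
# it into an ordinary column.
result = (
    pd.Series(top_artist, name="Artist")
    .rename_axis("YearReleased")
    .reset_index()
)
print(result)
```

This fixes the shape of the result but still relies on the loop; the groupby approaches in the answers avoid the loop entirely.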

Can somebody suggest a better approach that avoids looping over the dataframe and uses a built-in pandas method instead?


Answer 1:


Use idxmax, which returns the index label of the row holding the maximum count in each group:

df.loc[df.groupby('YearReleased')['count'].idxmax()]
Out[445]: 
    id  YearReleased      Artist  count
1  169          2015     Rihanna      3
4  172          2016     Rihanna      3
7  175          2017  Ed Sheeran      2
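
As a self-contained sketch of the idxmax approach, the snippet below rebuilds the dataframe from the question and keeps only the two columns the asker wanted. The variable names `top_rows` and `result` are illustrative, and the default 0-to-7 index is an assumption consistent with the row labels shown in the output above.

```python
import pandas as pd

# Data from the question, with a default RangeIndex (0..7).
df = pd.DataFrame({
    "id": [168, 169, 170, 171, 172, 173, 174, 175],
    "YearReleased": [2015, 2015, 2015, 2016, 2016, 2016, 2017, 2017],
    "Artist": ["Muse", "Rihanna", "Taylor Swift", "Jennifer Lopez",
               "Rihanna", "Underworld", "Coldplay", "Ed Sheeran"],
    "count": [1, 3, 2, 1, 3, 1, 1, 2],
})

# For each year, the index label of the row with the largest count.
top_rows = df.groupby("YearReleased")["count"].idxmax()

# Select those rows and keep only the year and artist columns.
result = df.loc[top_rows, ["YearReleased", "Artist"]].reset_index(drop=True)
print(result)
```

Note that idxmax picks the first row when several rows in a group share the maximum, so this yields exactly one artist per year.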



Answer 2:


You can use groupby and transform to build a boolean mask that marks the rows whose count equals the maximum for their year:

idx = df.groupby(['YearReleased'])['count'].transform('max') == df['count']

and then use this indexer:

df[idx]
Out[14]: 
    id  YearReleased      Artist  count
1  169          2015     Rihanna      3
4  172          2016     Rihanna      3
7  175          2017  Ed Sheeran      2
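
The two answers give the same result on the question's data, but they differ when a year has tied maximum counts. The sketch below uses hypothetical data (a 2015 tie between Muse and Rihanna that is not in the original post) to show the difference; the names `is_max`, `all_ties`, and `first_only` are illustrative.

```python
import pandas as pd

# Hypothetical data: Muse and Rihanna tie for the 2015 maximum.
df = pd.DataFrame({
    "YearReleased": [2015, 2015, 2016, 2016],
    "Artist": ["Muse", "Rihanna", "Jennifer Lopez", "Underworld"],
    "count": [3, 3, 1, 2],
})

# transform broadcasts each group's maximum back onto every row,
# so the comparison keeps every row that reaches the maximum.
is_max = df.groupby("YearReleased")["count"].transform("max") == df["count"]
all_ties = df[is_max]

# idxmax returns a single label per group: the first maximum it finds.
first_only = df.loc[df.groupby("YearReleased")["count"].idxmax()]

print(all_ties)
print(first_only)
```

If every tied artist should be reported, the transform mask is the better fit; if exactly one row per year is needed, idxmax is simpler.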


Source: https://stackoverflow.com/questions/49263437/how-to-get-value-of-a-column-based-on-the-maximum-of-another-column-in-case-of-d
