How to group “remaining” results beyond Top N into “Others” with pandas

孤独总比滥情好 2021-02-11 01:05

When grouping a pandas DataFrame by one column, say "version", which has 10 distinct versions, how can one plot the Top 3 (which cover over 90%) and put the small remainders in

2 Answers
  •  死守一世寂寞
    2021-02-11 01:38

    # number of top-n you want
    n = 2
    
    # group by & sort descending
    df_sorted = (df
                    .groupby('Version').sum()
                    .sort_values('Value', ascending=False)
                    .reset_index()
                )
    
    # rename rows other than top-n to 'Others'
    df_sorted.loc[df_sorted.index >= n, 'Version'] = 'Others'
    
    # re-group by again
    df_sorted.groupby('Version').sum()
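
For anyone who wants to try the idea end to end, here is a minimal self-contained sketch of the same approach. The helper name `top_n_with_others`, its parameters, and the sample data are made up for illustration and are not part of the answer above. Instead of relabelling rows and grouping a second time, it slices the sorted totals and appends one summed row, which keeps the top groups in descending order with "Others" last.

```python
import pandas as pd


def top_n_with_others(df, key, value, n, other_label="Others"):
    """Sum `value` per `key`, keep the n largest groups, lump the rest into one row.

    Hypothetical helper for illustration only.
    """
    # Total per group, largest first
    totals = df.groupby(key)[value].sum().sort_values(ascending=False)

    top = totals.iloc[:n]    # the n largest groups
    rest = totals.iloc[n:]   # everything else

    # Append a single "Others" row only if there is something left over
    if not rest.empty:
        top = pd.concat([top, pd.Series({other_label: rest.sum()})])

    # concat drops the index and series names, so restore them
    return top.rename_axis(key).rename(value)


# Made-up sample data: five versions with very uneven totals
df = pd.DataFrame({
    "Version": ["1.0", "1.0", "2.0", "2.0", "3.0", "4.0", "5.0"],
    "Value": [40, 10, 30, 5, 8, 4, 3],
})

result = top_n_with_others(df, "Version", "Value", n=2)
print(result)
```

With this sample data the two largest versions are kept and the remaining three are summed into a single "Others" entry. If matplotlib is installed, `result.plot.pie()` or `result.plot.bar()` should be enough to chart the grouped totals.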
    
