pandas groupby and rank within groups that start with 1 for each group

Submitted by 不想你离开。 on 2020-01-21 12:47:24

Question


I have a dataframe:

import pandas as pd

df = pd.DataFrame([[1, 'a'],
                    [1, 'a'],
                    [1, 'b'],
                    [1, 'a'],
                    [2, 'a'],
                    [2, 'b'],
                    [2, 'a'],
                    [2, 'b'],
                    [3, 'b'],
                    [3, 'a'],
                    [3, 'b'],
                   ], columns=['session', 'issue'])
df

I would like to rank issues within sessions. I tried with:

df.groupby(['session', 'issue']).size().rank(ascending=False, method='dense')

session  issue
1        a        1.0
         b        3.0
2        a        2.0
         b        2.0
3        a        3.0
         b        2.0
dtype: float64

What I need is a result like this:

  1. for session=1, there are three a issues and one b issue, so the ranks are a = 1 and b = 2
  2. for session=2, both issues occur equally often, so they should share the same rank = 1
  3. for session=3, there are two b issues and one a issue, so the ranks should be b = 1 and a = 2

Also, why don't the ranks start from 1, 2, 3... within each group?


Answer 1:


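Series.rank ranks all values of the Series together and ignores the index levels, which is why the ranks do not restart in each session. Group the counts again by the first level of the MultiIndex (session) and use SeriesGroupBy.rank: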

s = (df.groupby(['session', 'issue'])
        .size()
        .groupby(level=0)
        .rank(ascending=False, method='dense'))
print(s)
session  issue
1        a        1.0
         b        2.0
2        a        1.0
         b        1.0
3        a        2.0
         b        1.0
dtype: float64
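
To see the difference side by side, here is a minimal self-contained sketch that computes both the global ranking from the question and the per-session ranking from this answer. The variable names (counts, global_rank, per_session_rank, result) and the output column names 'count' and 'rank' are my own illustrative choices, not part of the original post. The cast to int assumes there are no missing values in the ranks, which holds for this data.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 'a'], [1, 'a'], [1, 'b'], [1, 'a'],
     [2, 'a'], [2, 'b'], [2, 'a'], [2, 'b'],
     [3, 'b'], [3, 'a'], [3, 'b']],
    columns=['session', 'issue'],
)

# Count how often each issue occurs within each session.
counts = df.groupby(['session', 'issue']).size()

# Series.rank ranks all six counts together, ignoring the session level.
global_rank = counts.rank(ascending=False, method='dense')

# Grouping by the session level first restarts the ranking in every session.
per_session_rank = (counts.groupby(level='session')
                          .rank(ascending=False, method='dense')
                          .astype(int))  # safe here: no NaN ranks

# Flatten to a regular DataFrame; 'count' and 'rank' are illustrative names.
result = (counts.rename('count').to_frame()
                .assign(rank=per_session_rank)
                .reset_index())
print(result)
```

Using level='session' instead of level=0 is equivalent here and may read more clearly, since the index levels keep the names of the grouping columns.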


Source: https://stackoverflow.com/questions/54530503/pandas-groupby-and-rank-within-groups-that-start-with-1-for-each-group
