Question
I have a DataFrame like this:
id points1 points2
1 44 53
1 76 34
1 63 66
2 23 34
2 44 56
I want output like this:
id points1 points2 points1_rank points2_rank
1 44 53 3 2
1 76 34 1 3
1 63 66 2 1
2 23 34 2 2
2 44 56 1 1
Basically, I want to groupby('id') and find the rank of each column within the same id.
I tried this:
features = ["points1","points2"]
df = pd.merge(df, df.groupby('id')[features].rank().reset_index(), suffixes=["", "_rank"], how='left', on=['id'])
But I get KeyError: 'id'.
Answer 1:
You need to pass ascending=False to rank so that the largest value in each group gets rank 1:
df.join(df.groupby('id')[['points1', 'points2']].rank(ascending=False).astype(int).add_suffix('_rank'))
+---+----+---------+---------+--------------+--------------+
| | id | points1 | points2 | points1_rank | points2_rank |
+---+----+---------+---------+--------------+--------------+
| 0 | 1 | 44 | 53 | 3 | 2 |
| 1 | 1 | 76 | 34 | 1 | 3 |
| 2 | 1 | 63 | 66 | 2 | 1 |
| 3 | 2 | 23 | 34 | 2 | 2 |
| 4 | 2 | 44 | 56 | 1 | 1 |
+---+----+---------+---------+--------------+--------------+
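For reference, here is a self-contained sketch of the same approach that rebuilds the sample frame from the question. One detail is an assumption of mine rather than part of the answer above: rank defaults to method="average", so tied values would get fractional ranks such as 1.5, which astype(int) then truncates. Passing method="min" keeps tied ranks whole; the sample data has no ties within a group, so the output matches the table above either way.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2],
    "points1": [44, 76, 63, 23, 44],
    "points2": [53, 34, 66, 34, 56],
})

features = ["points1", "points2"]

# Rank within each id, largest value first. The result keeps df's row
# index, so join can align it back without needing a key column.
ranks = (
    df.groupby("id")[features]
      .rank(ascending=False, method="min")  # method="min" is an assumption: ties share the lowest rank
      .astype(int)
      .add_suffix("_rank")
)

result = df.join(ranks)
print(result)
```

If tied rows should not leave gaps in the numbering, method="dense" is the other option worth considering.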
Answer 2:
Use join instead of merge, drop the reset_index call, and use add_suffix to rename the new columns:
features = ["points1","points2"]
df = df.join(df.groupby('id')[features].rank(ascending=False).add_suffix('_rank').astype(int))
print(df)
id points1 points2 points1_rank points2_rank
0 1 44 53 3 2
1 1 76 34 1 3
2 1 63 66 2 1
3 2 23 34 2 2
4 2 44 56 1 1
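As for why the original attempt raised KeyError: 'id', a short sketch may help. The result of groupby(...).rank() keeps the original row index and contains only the ranked columns, so reset_index() produces a column named index, not id, and the merge has no id column to match on. The variable name ranked below is just for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2],
    "points1": [44, 76, 63, 23, 44],
    "points2": [53, 34, 66, 34, 56],
})

features = ["points1", "points2"]

# rank() is applied per group, but the output is shaped like the input rows.
ranked = df.groupby("id")[features].rank(ascending=False)

# Only the ranked columns come back; 'id' is not among them.
print(ranked.columns.tolist())

# reset_index() turns the unnamed row index into a column called 'index'.
print(ranked.reset_index().columns.tolist())

# The row index is unchanged, which is why df.join(...) lines up correctly.
print(ranked.index.equals(df.index))
```

Because the index is preserved, an index-based join is enough and no merge key is needed.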
Source: https://stackoverflow.com/questions/54017968/how-to-rank-rows-by-id-in-pandas-python