I have a dataframe df like this, but much larger.
  ID_0 ID_1  location
0    a    b         1
1    a    c         1
2    a    b         0
3    d    c         0
4
You need GroupBy.ngroup, new in pandas 0.20.2:
df['group_ID'] = df.groupby(['ID_0', 'ID_1']).ngroup()
print(df)

  ID_0 ID_1  location  group_ID
0    a    b         1         0
1    a    c         1         1
2    a    b         0         0
3    d    c         0         2
4    a    c         0         1
5    a    c         1         1
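
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The DataFrame is rebuilt from the six rows shown in the output above, which is my assumption about what the sample data looks like. The rest is just the ngroup call from this answer.

```python
import pandas as pd

# Sample data rebuilt from the six rows shown in the printed output
# (an assumption; the real dataframe is much larger).
df = pd.DataFrame({
    "ID_0": ["a", "a", "a", "d", "a", "a"],
    "ID_1": ["b", "c", "b", "c", "c", "c"],
    "location": [1, 1, 0, 0, 0, 1],
})

# Every distinct (ID_0, ID_1) pair gets one integer label.
# With the default sort=True, labels follow the sorted order of the keys.
df["group_ID"] = df.groupby(["ID_0", "ID_1"]).ngroup()

print(df)
```

Rows that share the same ID_0 and ID_1 end up with the same group_ID, regardless of their location value.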

On older pandas versions without ngroup, the same labels can be read from the grouper:

df['group_ID'] = df.groupby(['ID_0', 'ID_1']).grouper.group_info[0]
print(df)

  ID_0 ID_1  location  group_ID
0    a    b         1         0
1    a    c         1         1
2    a    b         0         0
3    d    c         0         2
4    a    c         0         1
5    a    c         1         1
This should do the trick without using GroupBy.ngroup, which is only supported in newer pandas versions:
df['group_ID'] = df.groupby(['ID_0', 'ID_1']).grouper.group_info[0]

  ID_0 ID_1  location  group_ID
0    a    b         1         0
1    a    c         1         1
2    a    b         0         0
3    d    c         0         2
4    a    c         0         1
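
If you would rather not depend on either ngroup or the grouper attribute (which, as far as I know, is not part of the documented public interface), a lookup table plus merge gives the same labels. This is a different technique from the one above, shown only as a sketch: the helper name add_group_id is made up for illustration, and the sample rows are assumed from the output shown earlier.

```python
import pandas as pd

def add_group_id(frame, keys, name="group_ID"):
    """Hypothetical helper: label each distinct key combination with an integer."""
    # One row per distinct key combination, in sorted key order.
    lookup = frame[keys].drop_duplicates().sort_values(keys).reset_index(drop=True)
    # The position in the sorted lookup table becomes the group label.
    lookup[name] = range(len(lookup))
    # A left merge keeps the row order of the original frame.
    return frame.merge(lookup, on=keys, how="left")

# Sample rows assumed from the printed output in this thread.
df = pd.DataFrame({
    "ID_0": ["a", "a", "a", "d", "a", "a"],
    "ID_1": ["b", "c", "b", "c", "c", "c"],
    "location": [1, 1, 0, 0, 0, 1],
})

out = add_group_id(df, ["ID_0", "ID_1"])
print(out)
```

Note that merge returns a new frame with a fresh integer index, so this suits a dataframe with the default index. For a custom index you would need to restore it afterwards.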
Find more information in this SO post: "Python Pandas: How can I group by and assign an id to all the items in a group?"