Question
I have a df like this:
Id  username   age
1   michael    34
6   Mike       65
7   Stephanie  14
1   Mikael     34
6   Mick       65
As you can see, the username is not always written the same way for the same Id. I would like to gather all usernames for an Id into a single row, like this:
Id  username   username_2  age
1   michael    Mikael      34
6   Mike       Mick        65
7   Stephanie              14
Thanks.
Answer 1:
You can build a MultiIndex from Id and a per-Id counter produced by cumcount, which numbers the duplicated Id values. Then reshape with unstack, and clean up the result with add_prefix and reset_index:
df1 = (df.set_index(['Id', df.groupby('Id').cumcount()])['username']
         .unstack(fill_value='')
         .add_prefix('username_')
         .reset_index())
print(df1)
    Id username_0 username_1
0  1.0    michael     Mikael
1  6.0       Mike       Mick
2  7.0  Stephanie
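To make the role of cumcount easier to see, here is a minimal, self-contained sketch of the same approach. The sample data is reconstructed from the question (with Id stored as integers, which is an assumption), and the names counter and wide are only illustrative.

```python
import pandas as pd

# Sample data reconstructed from the question (assumption: Id is an integer column).
df = pd.DataFrame({
    "Id": [1, 6, 7, 1, 6],
    "username": ["michael", "Mike", "Stephanie", "Mikael", "Mick"],
    "age": [34, 65, 14, 34, 65],
})

# cumcount numbers the rows inside each Id group: 0 for the first row, 1 for the second, ...
counter = df.groupby("Id").cumcount()
print(counter.tolist())

# Index by (Id, counter), then move the counter level into the columns.
wide = (
    df.set_index(["Id", counter])["username"]
      .unstack(fill_value="")      # Ids with fewer usernames get an empty string
      .add_prefix("username_")
      .reset_index()
)
print(wide)
```

Each Id ends up as one row, and the counter value decides which username_N column a given name lands in.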
Alternatively, use rename so that the column numbering starts from 1:
df1 = (df.set_index(['Id', df.groupby('Id').cumcount()])['username']
         .unstack(fill_value='')
         .rename(columns=lambda x: f'username_{x+1}')
         .reset_index())
print(df1)
    Id username_1 username_2
0  1.0    michael     Mikael
1  6.0       Mike       Mick
2  7.0  Stephanie
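The desired output in the question also keeps the age column, which the snippets above drop. One possible way to retain it, sketched here under the assumption that age is identical for every row sharing an Id, is to add age to the index before unstacking. The sample data is reconstructed from the question and the name wide is illustrative.

```python
import pandas as pd

# Sample data reconstructed from the question (assumption: Id is an integer column).
df = pd.DataFrame({
    "Id": [1, 6, 7, 1, 6],
    "username": ["michael", "Mike", "Stephanie", "Mikael", "Mick"],
    "age": [34, 65, 14, 34, 65],
})

# Assumption: age is constant within each Id. If it were not, an Id would be
# split across several rows, one per distinct (Id, age) pair.
wide = (
    df.set_index(["Id", "age", df.groupby("Id").cumcount()])["username"]
      .unstack(fill_value="")
      .rename(columns=lambda x: f"username_{x + 1}")
      .reset_index()
)

# Move age to the end to match the layout asked for in the question.
wide = wide[["Id", "username_1", "username_2", "age"]]
print(wide)
```

Only the last index level (the counter) is unstacked, so Id and age both survive as ordinary columns after reset_index.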
Source: https://stackoverflow.com/questions/54802003/create-columns-from-row-with-same-id