I have a dataframe which looks like this:
Each user has 10 records. Now, I want to create a dataframe which looks like this:
<groupby('userid')
then reset_index
within each group to enumerate consistently across groups. Then unstack
to get columns.
df.groupby('userid')['name'].apply(lambda df: df.reset_index(drop=True)).unstack()
df = pd.DataFrame([
[123, 'abc'],
[123, 'abc'],
[456, 'def'],
[123, 'abc'],
[123, 'abc'],
[456, 'def'],
[456, 'def'],
[456, 'def'],
], columns=['userid', 'name'])
df.sort_values('userid').groupby('userid')['name'].apply(lambda df: df.reset_index(drop=True)).unstack()
if you don't want the userid
as the index, add reset_index
to the end.
df.sort_values('userid').groupby('userid')['name'].apply(lambda df: df.reset_index(drop=True)).unstack().reset_index()