I have a pandas DataFrame df:
+------+---------+
| team | user    |
+------+---------+
| A    | elmer   |
| A    | daffy   |
| A    | bugs    |
|
Do the following:
df.groupby('team').apply(lambda x: ','.join(x.user))
to get a Series of strings, or
df.groupby('team').apply(lambda x: list(x.user))
to get a Series of lists of strings.
Here's what the results look like:
In [33]: df.groupby('team').apply(lambda x: ', '.join(x.user))
Out[33]:
team
a    elmer, daffy, bugs, foghorn, goofy, marvin
b    dawg, speedy, pepe
c    petunia, porky
dtype: object

In [34]: df.groupby('team').apply(lambda x: list(x.user))
Out[34]:
team
a    [elmer, daffy, bugs, foghorn, goofy, marvin]
b    [dawg, speedy, pepe]
c    [petunia, porky]
dtype: object
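If you want to reproduce this yourself, here is a minimal sketch. The DataFrame is rebuilt from the names shown in the output above (an assumption, since the full input table isn't shown), and it selects the 'user' column before calling apply, which is a variation on the lambdas above rather than the exact same call:

```python
import pandas as pd

# Sample data reconstructed from the output shown above (assumed input).
df = pd.DataFrame({
    'team': ['a'] * 6 + ['b'] * 3 + ['c'] * 2,
    'user': ['elmer', 'daffy', 'bugs', 'foghorn', 'goofy', 'marvin',
             'dawg', 'speedy', 'pepe',
             'petunia', 'porky'],
})

# Select the column first, then apply; each group arrives as a Series of names.
joined = df.groupby('team')['user'].apply(', '.join)   # Series of strings
as_lists = df.groupby('team')['user'].apply(list)      # Series of lists

print(joined)
print(as_lists)
```

Selecting the column first means the function only ever sees the user names, so no lambda is needed.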
Note that, in general, further operations on Series like these will be slow and are discouraged: a Series holding lists or joined strings has object dtype, so pandas cannot use its vectorized operations on it. If there is a way to aggregate without putting a list inside a Series, use that approach instead.
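As an illustration of that advice, suppose what you really need is the number of users per team, or whether a team contains a given user (both are hypothetical goals, not something stated in the question). A rough sketch that aggregates directly instead of going through lists:

```python
import pandas as pd

# Same assumed sample data as in the output above.
df = pd.DataFrame({
    'team': ['a'] * 6 + ['b'] * 3 + ['c'] * 2,
    'user': ['elmer', 'daffy', 'bugs', 'foghorn', 'goofy', 'marvin',
             'dawg', 'speedy', 'pepe',
             'petunia', 'porky'],
})

# Roundabout: build a list per team, then take the length of each list.
counts_via_lists = df.groupby('team')['user'].apply(list).apply(len)

# Direct: let groupby count the rows in each group.
counts = df.groupby('team')['user'].size()

# Direct membership test: does each team contain the user 'daffy'?
has_daffy = df['user'].eq('daffy').groupby(df['team']).any()
```

Both count Series hold the same numbers; the direct version just never creates the intermediate lists.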
A more general solution, if you want to use agg:
df.groupby('team').agg({'user' : lambda x: ', '.join(x)})
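A short sketch of what agg buys you, again on sample data assumed from the output above. The dict form returns a DataFrame rather than a Series, and named aggregation (available in pandas 0.25 and later) lets you compute several columns in one pass. The output column names `users` and `n_users` are made up for this example:

```python
import pandas as pd

# Sample data assumed from the output shown earlier.
df = pd.DataFrame({
    'team': ['a'] * 6 + ['b'] * 3 + ['c'] * 2,
    'user': ['elmer', 'daffy', 'bugs', 'foghorn', 'goofy', 'marvin',
             'dawg', 'speedy', 'pepe',
             'petunia', 'porky'],
})

# Dict form: one aggregation per column; the result is a DataFrame
# with a single 'user' column, indexed by team.
by_dict = df.groupby('team').agg({'user': ', '.join})

# Named aggregation: several outputs from the same column in one call.
named = df.groupby('team').agg(
    users=('user', ', '.join),   # joined names
    n_users=('user', 'count'),   # number of non-null names
)

print(by_dict)
print(named)
```

Passing `', '.join` directly works because agg hands each group's values to the function as a Series, which `str.join` can iterate over.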