Replicating GROUP_CONCAT for pandas.DataFrame

误落风尘 2020-12-23 15:10

I have a pandas DataFrame df:

+------+---------+  
| team | user    |  
+------+---------+  
| A    | elmer   |  
| A    | daffy   |  
| A    | bugs    |  
|         


        
2 Answers
  • 2020-12-23 15:50

    Do the following:

    df.groupby('team').apply(lambda x: ','.join(x.user))
    

    to get a Series of strings or

    df.groupby('team').apply(lambda x: list(x.user))
    

    to get a Series of lists of strings.

    Here's what the results look like:

    In [33]: df.groupby('team').apply(lambda x: ', '.join(x.user))
    Out[33]:
    team
    a       elmer, daffy, bugs, foghorn, goofy, marvin
    b                               dawg, speedy, pepe
    c                                   petunia, porky
    dtype: object
    
    In [34]: df.groupby('team').apply(lambda x: list(x.user))
    Out[34]:
    team
    a       [elmer, daffy, bugs, foghorn, goofy, marvin]
    b                               [dawg, speedy, pepe]
    c                                   [petunia, porky]
    dtype: object
    

    Note that further operations on Series like these (object dtype, holding strings or lists) tend to be slow and are generally discouraged. If you can aggregate some other way, without putting a list inside a Series, consider that approach instead.
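As a rough illustration of that last point, here is a minimal sketch that counts the users per team two ways. The sample data is hypothetical (loosely modeled on the output shown in this answer), and the names `via_lists` and `direct` are made up for the example.

```python
import pandas as pd

# Hypothetical sample data, loosely modeled on the output shown in the answer.
df = pd.DataFrame({
    "team": ["a", "a", "a", "b", "b", "c"],
    "user": ["elmer", "daffy", "bugs", "dawg", "speedy", "petunia"],
})

# Roundabout: build a list per team, then measure each list in Python.
via_lists = df.groupby("team")["user"].agg(list).map(len)

# Direct: let pandas count the non-null users in each group.
direct = df.groupby("team")["user"].count()

print(direct)
```

Both give the same counts, but the direct version never materializes a Series of lists, so it stays on pandas' built-in aggregation path.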

  • 2020-12-23 16:01

    A more general solution if you want to use agg:

    df.groupby('team').agg({'user' : lambda x: ', '.join(x)})
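If a flat table is wanted, closer to what a SQL GROUP_CONCAT query returns, one possible variation is named aggregation followed by `reset_index()`. This is only a sketch: the sample data and the output column name `users` are assumptions for illustration, not part of the question.

```python
import pandas as pd

# Hypothetical sample data; the column name "users" below is also an assumption.
df = pd.DataFrame({
    "team": ["a", "a", "a", "b", "b", "c"],
    "user": ["elmer", "daffy", "bugs", "dawg", "speedy", "petunia"],
})

# Named aggregation: output column "users" is built from input column "user".
result = (
    df.groupby("team")
      .agg(users=("user", ", ".join))
      .reset_index()  # turn the "team" index back into a regular column
)

print(result)
```

Passing `', '.join` directly does the same job as the lambda above, since `agg` calls it with each group's values.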
    