Groupby with User Defined Functions Pandas

清歌不尽 2021-01-30 21:45

I understand that passing a function as a group key calls the function once per index value, with the return values being used as the group names. What I can't figure out is how

1 answer
  • 2021-01-30 22:39

    To group rows by whether column a is greater than 1, you can define your function like this:

    >>> def GroupColFunc(df, ind, col):
    ...     if df[col].loc[ind] > 1:
    ...         return 'Group1'
    ...     else:
    ...         return 'Group2'
    ... 
    

    And then call it like this:

    >>> people.groupby(lambda x: GroupColFunc(people, x, 'a')).sum()
                   a         b         c         d        e
    Group2 -2.384614 -0.762208  3.359299 -1.574938 -2.65963
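
For anyone who wants to run this end to end, here is a minimal sketch of the same idea. The `people` frame is not shown in the thread, so the data below is made up, and `group_col_func` and the `seen` list are illustrative names only. The `seen` list just records which index labels the key function receives.

```python
import pandas as pd

# Hypothetical stand-in for the `people` frame (the real data is not in the thread).
people = pd.DataFrame(
    {
        "a": [0.5, 1.5, -0.25, 2.0, 0.25],
        "b": [1.0, 0.5, 0.25, 2.5, -1.0],
    },
    index=["Joe", "Steve", "Wes", "Jim", "Travis"],
)

seen = []  # every index label passed to the key function ends up here


def group_col_func(df, ind, col):
    """Return the group name for the row labelled `ind`, based on df[col]."""
    seen.append(ind)
    return "Group1" if df.loc[ind, col] > 1 else "Group2"


# groupby hands each index label to the lambda; the return value is the group name.
result = people.groupby(lambda x: group_col_func(people, x, "a")).sum()

print(seen)
print(result)
```

With this made-up data, Steve and Jim land in Group1 and the other three rows in Group2, so both groups show up in the result.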
    

    Or you can do it with just an anonymous function:

    >>> people.groupby(lambda x: 'Group1' if people['b'].loc[x] > people['a'].loc[x] else 'Group2').sum()
                   a         b         c         d         e
    Group1 -3.280319 -0.007196  1.525356  0.324154 -1.002439
    Group2  0.895705 -0.755012  1.833943 -1.899092 -1.657191
    

    As the documentation says, you can also group by passing a Series that provides a label -> group name mapping:

    >>> mapping = np.where(people['b'] > people['a'], 'Group1', 'Group2')
    >>> mapping
    Joe       Group2
    Steve     Group1
    Wes       Group2
    Jim       Group1
    Travis    Group1
    dtype: string48
    >>> people.groupby(mapping).sum()
                   a         b         c         d         e
    Group1 -3.280319 -0.007196  1.525356  0.324154 -1.002439
    Group2  0.895705 -0.755012  1.833943 -1.899092 -1.657191
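
A note for newer setups: as far as I can tell, in recent pandas/NumPy versions `np.where` returns a plain NumPy array rather than a labelled Series. `groupby` still accepts an array of the right length, but if you want an explicit label -> group name mapping you can wrap it in a Series yourself. A rough sketch, again with made-up data standing in for `people`:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the `people` frame (the real data is not in the thread).
people = pd.DataFrame(
    {
        "a": [0.5, 1.5, -0.25, 2.0, 0.25],
        "b": [1.0, 0.5, 0.25, 2.5, -1.0],
    },
    index=["Joe", "Steve", "Wes", "Jim", "Travis"],
)

# Wrap the np.where result in a Series that shares the index of `people`,
# so each row label maps to a group name.
mapping = pd.Series(
    np.where(people["b"] > people["a"], "Group1", "Group2"),
    index=people.index,
)
by_series = people.groupby(mapping).sum()

# Same grouping with a key function, for comparison.
by_func = people.groupby(
    lambda x: "Group1" if people.loc[x, "b"] > people.loc[x, "a"] else "Group2"
).sum()

print(mapping)
print(by_series)
```

Both versions should give the same sums. The Series version evaluates the comparison once for the whole column instead of doing a `.loc` lookup per label, which tends to matter on larger frames.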
    