pandas, apply multiple functions of multiple columns to groupby object

迷失自我 2021-02-13 12:55

I want to apply multiple functions of multiple columns to a groupby object which results in a new pandas.DataFrame.

I know how to do it in separate steps:

6 Answers
  •  遇见更好的自我
    2021-02-13 13:05

    In response to the bounty, we can make this more general with partial application, using the standard library's functools.partial function.

    import functools
    import pandas as pd
    
    #same data as other answer:
    lasts = pd.DataFrame({'user':['a','s','d','d'],
                       'elapsed_time':[40000,50000,60000,90000],
                       'running_time':[30000,20000,30000,15000],
                       'num_cores':[7,8,9,4]})
    
    # Weighted sum of a column by the core count, converted from seconds to days:
    def myfunc(column, df, cores):
        # df.ix was removed in pandas 1.0; .loc selects the same rows by label
        return (column * df.loc[column.index, cores]).sum() / 86400
    
    # Use partial to fix the df and cores arguments, leaving only the column:
    mynewfunc = functools.partial(myfunc, df=lasts, cores='num_cores')
    
    # Aggregate both columns with the partial function:
    lasts.groupby('user').agg({'elapsed_time': mynewfunc, 'running_time': mynewfunc})
    

    Which gives us:

          running_time  elapsed_time
    user
    a         2.430556      3.240741
    d         3.819444     10.416667
    s         1.851852      4.629630
    

    This adds little for the specific example given, but the same pattern carries over to other columns and functions.
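As a possible alternative to closing over the whole DataFrame with functools.partial, the same numbers can be computed with groupby(...).apply and a function that returns one pandas.Series per group. This is only a sketch, not the method from the answer above: the helper name core_days and its columns and cores parameters are hypothetical, chosen for illustration. The column selection before apply keeps the grouping column out of each group.

```python
import pandas as pd

# Same data as in the answer above.
lasts = pd.DataFrame({'user': ['a', 's', 'd', 'd'],
                      'elapsed_time': [40000, 50000, 60000, 90000],
                      'running_time': [30000, 20000, 30000, 15000],
                      'num_cores': [7, 8, 9, 4]})

def core_days(group, columns=('elapsed_time', 'running_time'), cores='num_cores'):
    # Hypothetical helper: for each requested column, sum column * cores
    # within the group, then convert seconds to days.
    return pd.Series({col: (group[col] * group[cores]).sum() / 86400
                      for col in columns})

# Each group is passed to core_days as a DataFrame, so the function
# can combine several columns without referring back to lasts.
result = (lasts.groupby('user')[['elapsed_time', 'running_time', 'num_cores']]
               .apply(core_days))
print(result)
```

Because each group arrives as a DataFrame, the function needs no reference to the outer frame, and further derived columns can be added by extending the dictionary inside the helper.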
