I want to apply multiple functions of multiple columns to a groupby object, resulting in a new pandas.DataFrame.
I know how to do it in separate steps:
In response to the bounty, we can make it more general by using partial application, via the standard library's functools.partial function.
import functools
import pandas as pd
#same data as other answer:
lasts = pd.DataFrame({'user': ['a', 's', 'd', 'd'],
                      'elapsed_time': [40000, 50000, 60000, 90000],
                      'running_time': [30000, 20000, 30000, 15000],
                      'num_cores': [7, 8, 9, 4]})
#define the desired lambda as a function
#(column.index holds the row labels of the current group, so df.loc picks the matching cores):
def myfunc(column, df, cores):
    return (column * df.loc[column.index][cores]).sum()/86400
#use partial to fix the df and cores arguments, leaving only column free:
mynewfunc = functools.partial(myfunc, df=lasts, cores='num_cores')
#agg by the partial function
lasts.groupby('user').agg({'elapsed_time':mynewfunc, 'running_time':mynewfunc})
Which gives us:
      running_time  elapsed_time
user
a         2.430556      3.240741
d         3.819444     10.416667
s         1.851852      4.629630
This is overkill for the example given, but the same pattern may be more useful in the general case.
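To make the "general case" concrete, here is a rough sketch that wraps the approach above in a reusable helper. The names `weighted_sum` and `weighted_agg` and the `scale` parameter are made up for illustration and are not part of pandas. It also assumes the DataFrame has a unique index, since the group's rows are looked up by label.

```python
import functools
import pandas as pd

# Same data as above.
lasts = pd.DataFrame({'user': ['a', 's', 'd', 'd'],
                      'elapsed_time': [40000, 50000, 60000, 90000],
                      'running_time': [30000, 20000, 30000, 15000],
                      'num_cores': [7, 8, 9, 4]})

def weighted_sum(column, df, weights, scale=1.0):
    # Hypothetical helper: sum of column * df[weights] over the group's rows,
    # divided by scale. Assumes df has a unique index.
    return (column * df.loc[column.index, weights]).sum() / scale

def weighted_agg(df, by, columns, weights, scale=1.0):
    # Fix df, weights and scale once; agg then calls func(column) per group.
    func = functools.partial(weighted_sum, df=df, weights=weights, scale=scale)
    return df.groupby(by).agg({col: func for col in columns})

result = weighted_agg(lasts, 'user', ['elapsed_time', 'running_time'],
                      'num_cores', scale=86400)
print(result)
```

The only thing that changes between use cases is the arguments passed to `weighted_agg`, so the same partial can be reused for any set of columns and any weight column.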