pandas, apply multiple functions of multiple columns to groupby object

迷失自我 2021-02-13 12:55

I want to apply multiple functions of multiple columns to a groupby object which results in a new pandas.DataFrame.

I know how to do it in separate steps:

6 answers

野性不改 (original poster)

2021-02-13 13:13
To use the agg method on a groupby object while pulling in data from other columns of the same DataFrame, you could do the following:
1. Define your functions (lambdas or regular functions) so that each takes a Series as input and fetches the data from the other column(s) with the df.loc[series.index, col] syntax. For example:
```
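# x is one group's Series; x.index selects the matching rows of num_cores.
# 86400.0 is the divisor used to express each weighted sum in days.
ed = lambda x: (x * lasts.loc[x.index, "num_cores"]).sum() / 86400.0
rd = lambda x: (x * lasts.loc[x.index, "num_cores"]).sum() / 86400.0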
```
  where lasts is the main DataFrame, and we access the data in the column num_cores thanks to the .loc method.
2. Create a dictionary that maps these functions to the names of the newly created columns. The keys are the names of the columns on which to apply each function, and each value is another dictionary whose key is the new column name and whose value is the function. Note that this nested-dictionary form was deprecated in pandas 0.20 and removed in pandas 1.0, where it raises a SpecificationError.
```
my_func = {"elapsed_time" : {"elapsed_day" : ed},
           "running_time" : {"running_days" : rd}}
```
3. Groupby and aggregate:
```
user_df = lasts.groupby("user").agg(my_func)
user_df
     elapsed_time running_time
      elapsed_day running_days
user                          
a        3.240741     2.430556
d       10.416667     3.819444
s        4.629630     1.851852
```
4. If you want to remove the old column names:
```
user_df.columns = user_df.columns.droplevel(0)
user_df
      elapsed_day  running_days
user                           
a        3.240741      2.430556
d       10.416667      3.819444
s        4.629630      1.851852
```
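On pandas 0.25 or later, the steps above can be sketched with named aggregation, which yields flat column names directly and also runs on pandas 1.0+, where the nested-dictionary form is rejected. This is only a minimal sketch: the contents of `lasts` below are hypothetical sample data (the post does not show the real frame), the times are assumed to be in seconds, and `core_days` is a helper name introduced here for illustration.

```python
import pandas as pd

# Hypothetical sample data; the original post does not show `lasts`.
lasts = pd.DataFrame({
    "user": ["a", "a", "d"],
    "elapsed_time": [86400, 43200, 86400],  # assumed to be seconds
    "running_time": [43200, 43200, 21600],  # assumed to be seconds
    "num_cores": [1, 2, 4],
})

def core_days(x):
    # x is one group's Series; its index picks the matching num_cores rows.
    return (x * lasts.loc[x.index, "num_cores"]).sum() / 86400.0

# Named aggregation: new_column=(source_column, function).
user_df = lasts.groupby("user").agg(
    elapsed_day=("elapsed_time", core_days),
    running_days=("running_time", core_days),
)
print(user_df)
```

With this sample data, user a gets elapsed_day 2.0 and running_days 1.5, and no droplevel step is needed because the result has a single level of column names.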
HTH