Sum only numerical columns and divide values by total

不思量自难忘° 2021-01-24 18:27

I am having trouble with some calculations on a data frame.

Here is my DF (with many more rows and columns)

What I am trying to do is:

Step (1) - For e

2 Answers
  •  盖世英雄少女心
    2021-01-24 19:01

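    We can first separate the "Assets" and "Returns" columns, then use colSums() to turn the assets into shares of each month's total, multiply those shares by the corresponding returns, and sum the result per month: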

    asset_col <- grep("^Assets", names(df1))
    return_col <- grep("^Returns", names(df1))
    colSums(t(t(df1[asset_col])/colSums(df1[asset_col])) * df1[return_col])
    
    #Returns_Jan_2018 Returns_Feb_2018 
    #        3.504230         4.633941 
    

    To break it down and make each step clear:

    Step 1 - For each month I would like to sum the assets columns

    colSums(df1[asset_col])
    #Assets_Jan_2018 Assets_Feb_2018 
    #    1466742         2049689 
    

    Step 2 - For each firm, I would like to divide assets each month by the total for the month

    t(t(df1[asset_col])/colSums(df1[asset_col]))
    #     Assets_Jan_2018 Assets_Feb_2018
    #[1,]      0.14333400      0.11485889
    #[2,]      0.08395751      0.06202112
    #[3,]      0.61217106      0.38532577
    #[4,]      0.16053744      0.43779422
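
    The double transpose in Step 2 is needed because R recycles a vector down the rows of a matrix, so the column totals have to line up with rows before dividing. As a rough sketch of the same idea, sweep() and prop.table() from base R give identical column-wise shares. The data frame below is hypothetical toy data (the question's actual df1 is not shown in full), and the names shares_t, shares_sweep and shares_prop are made up for illustration.

    ```r
    # Hypothetical toy data; not the data from the question.
    df1 <- data.frame(
      Firm            = c("A", "B", "C", "D"),
      Assets_Jan_2018 = c(100, 200, 300, 400),
      Assets_Feb_2018 = c(50, 150, 300, 500)
    )
    asset_col <- grep("^Assets", names(df1))

    # Idiom from the answer: transpose, divide by the column totals, transpose back.
    shares_t <- t(t(df1[asset_col]) / colSums(df1[asset_col]))

    # sweep() divides each column (MARGIN = 2) by the matching entry of STATS.
    shares_sweep <- sweep(as.matrix(df1[asset_col]), 2, colSums(df1[asset_col]), "/")

    # prop.table() with margin = 2 computes the same column-wise proportions.
    shares_prop <- prop.table(as.matrix(df1[asset_col]), margin = 2)

    # All three approaches agree, and every column of shares sums to 1.
    stopifnot(
      isTRUE(all.equal(shares_t, shares_sweep, check.attributes = FALSE)),
      isTRUE(all.equal(shares_t, shares_prop,  check.attributes = FALSE))
    )
    colSums(shares_t)
    ```

    Which form to use is mostly a matter of taste; sweep() and prop.table() state the "by column" intent explicitly, while t(t(x) / v) avoids the conversion to a matrix being written out.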
    

    Step 3 - Then I would like to take the values from step (2) and multiply by the corresponding returns

    t(t(df1[asset_col])/colSums(df1[asset_col])) * df1[return_col]
    
    #  Returns_Jan_2018 Returns_Feb_2018
    #1        0.6450030       0.76955455
    #2        0.4449748       0.07442534
    #3        0.8570395       2.38901980
    #4        1.5572131       1.40094151
    

    Step 4 - I would like to sum each column in step (3)

    colSums(t(t(df1[asset_col])/colSums(df1[asset_col])) * df1[return_col])
    
    #Returns_Jan_2018 Returns_Feb_2018 
    #        3.504230         4.633941 
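
    Taken together, Steps 1 to 4 compute an asset-weighted mean of the returns for each month, so the result can be cross-checked with weighted.mean(). The sketch below uses hypothetical toy data (not the question's data), and res and res_wm are illustrative names. It also assumes, as the answer does, that the "Assets" and "Returns" columns appear in the same month order, because they are paired by position.

    ```r
    # Hypothetical toy data; not the data from the question.
    df1 <- data.frame(
      Firm             = c("A", "B", "C", "D"),
      Assets_Jan_2018  = c(100, 200, 300, 400),
      Returns_Jan_2018 = c(1, 2, 3, 4),
      Assets_Feb_2018  = c(50, 150, 300, 500),
      Returns_Feb_2018 = c(4, 3, 2, 1)
    )

    asset_col  <- grep("^Assets",  names(df1))
    return_col <- grep("^Returns", names(df1))

    # One-liner from the answer: asset shares times returns, summed per month.
    res <- colSums(t(t(df1[asset_col]) / colSums(df1[asset_col])) * df1[return_col])

    # Same quantity as an asset-weighted mean of returns, month by month.
    # Columns are paired by position, so both sets must be in the same month order.
    res_wm <- mapply(weighted.mean, df1[return_col], df1[asset_col])

    stopifnot(isTRUE(all.equal(unname(res), unname(res_wm))))
    res  # 3.00 for Jan 2018 and 1.75 for Feb 2018 with this toy data
    ```

    If the column order cannot be relied on, matching the month suffix of each name (for example with sub("^Assets_", "", names(df1)[asset_col])) before pairing would be a safer variant.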
    
