R dplyr mutate: calculating the standard deviation for each row

我寻月下人不归 2021-01-05 17:44

I am trying to calculate the mean and standard deviation from certain columns in a data frame, and return those values to new columns in the data frame. I can get this to wo

3 Answers
  • 2021-01-05 17:50

    You can also write your own vectorised RowSD function:

    # Row-wise sample standard deviation of a numeric matrix (n - 1 denominator)
    RowSD <- function(x) {
      sqrt(rowSums((x - rowMeans(x))^2) / (ncol(x) - 1))
    }
    

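    and then call it inside mutate(), binding the columns into a matrix with cbind():

    library(dplyr)  # provides %>% and mutate()
    mtcars %>%
      mutate(mean = (hp + drat + wt)/3, stdev = RowSD(cbind(hp, drat, wt)))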
    ##     mpg cyl  disp  hp drat    wt  qsec vs am gear carb      mean     stdev
    ## 1  21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4  38.84000  61.62969
    ## 2  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4  38.92500  61.55489
    ## 3  22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1  33.05667  51.91809
    ## 4  21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1  38.76500  61.69136
    ## 5  18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2  60.53000  99.13403
    ## 6  18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1  37.07333  58.82726
    ## ...
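
    As a quick sanity check, the hand-written function can be compared against base R's sd() applied row by row. This is only a sketch: the names x, fast and slow are illustrative and not part of the answer above.

    ```r
    # Row-wise sample standard deviation, same definition as in the answer above
    RowSD <- function(x) {
      sqrt(rowSums((x - rowMeans(x))^2) / (ncol(x) - 1))
    }

    # Illustrative names: x, fast, slow
    x <- as.matrix(mtcars[, c("hp", "drat", "wt")])
    fast <- RowSD(x)          # vectorised over the whole matrix
    slow <- apply(x, 1, sd)   # one sd() call per row

    # TRUE when both approaches agree up to floating-point tolerance
    isTRUE(all.equal(fast, slow))
    ```

    The vectorised version avoids calling sd() once per row, which tends to matter more as the number of rows grows.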
    
  • 2021-01-05 17:52

    You could try

    library(dplyr)
    library(matrixStats)
    nm1 <- c('hp', 'drat', 'wt')
    res1 <- mtcars %>% 
               mutate(Mean= rowMeans(.[nm1]), stdev=rowSds(as.matrix(.[nm1])))
    
    head(res1,3)
    #   mpg cyl disp  hp drat    wt  qsec vs am gear carb     Mean    stdev
    #1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4 38.84000 61.62969
    #2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4 38.92500 61.55489
    #3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1 33.05667 51.91809
    

    Or using do():

    res2 <- mtcars %>% 
                 rowwise() %>%
                 do(data.frame(., Mean=mean(unlist(.[nm1])),
                             stdev=sd(unlist(.[nm1]))))
    
    head(res2,3)
    #   mpg cyl disp  hp drat    wt  qsec vs am gear carb     Mean    stdev
    #1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4 38.84000 61.62969
    #2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4 38.92500 61.55489
    #3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1 33.05667 51.91809
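
    In dplyr 1.0.0 and later, do() is superseded, and the same row-wise computation can be written with rowwise() and c_across(). The following is a sketch under that version assumption; res3 is a name introduced here for illustration.

    ```r
    library(dplyr)  # assumes dplyr >= 1.0.0 for c_across()

    nm1 <- c("hp", "drat", "wt")

    res3 <- mtcars %>%
      rowwise() %>%                                   # treat each row as its own group
      mutate(Mean  = mean(c_across(all_of(nm1))),     # mean of the selected columns in this row
             stdev = sd(c_across(all_of(nm1)))) %>%   # sd of the same values
      ungroup()                                       # drop the row-wise grouping again

    head(res3, 3)
    ```

    Note that rowwise() returns a tibble, so the row names of mtcars are not carried over into res3.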
    
  • 2021-01-05 18:10

    Not much change needed, just add rowwise() (thanks @akrun for the comment) and wrap your column names in c(...) (to fix the error):

    library(dplyr)
    mtcars %>%
        rowwise() %>%
        mutate(mean=(hp+drat+wt)/3, stdev = sd(c(hp,drat,wt)))
    ## Source: local data frame [32 x 13]
    ## Groups: <by row>
    ##     mpg cyl  disp  hp drat    wt  qsec vs am gear carb     mean     stdev
    ## 1  21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4 38.84000  61.62969
    ## 2  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4 38.92500  61.55489
    ## 3  22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1 33.05667  51.91809
    ## 4  21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1 38.76500  61.69136
    ## 5  18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2 60.53000  99.13403
    ## 6  18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1 37.07333  58.82726
    ## 7  14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4 83.92667 139.49371
    ## 8  24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2 22.96000  33.81056
    ## 9  22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2 34.02333  52.80875
    ## 10 19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4 43.45333  68.88985
    ## ..  ... ...   ... ...  ...   ...   ... .. ..  ...  ...      ...       ...
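
    To see why both changes matter: sd() takes a single vector as its first argument, so the values have to be combined with c(...), and without rowwise() that vector contains the full columns rather than one row's values. The sketch below illustrates the difference; the names pooled and by_row are illustrative only.

    ```r
    library(dplyr)

    # Without rowwise(): c(hp, drat, wt) concatenates three whole columns,
    # so sd() returns one pooled value that is recycled to every row
    pooled <- mtcars %>%
      mutate(stdev = sd(c(hp, drat, wt)))

    # With rowwise(): c(hp, drat, wt) holds just the three values of the current row
    by_row <- mtcars %>%
      rowwise() %>%
      mutate(stdev = sd(c(hp, drat, wt))) %>%
      ungroup()

    n_distinct(pooled$stdev)   # a single repeated value
    n_distinct(by_row$stdev)   # varies from row to row
    ```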
    