Change row values to zero if less than row standard deviation

小鲜肉 2021-01-19 04:58

I want to change all values of a row to zero if they are less than the standard deviation of that row.

set.seed(007)
X <- data.frame(matrix(sample(c(5:50)         


        
2 Answers
  •  迷失自我
    2021-01-19 05:41

    I suspect this is slower than the apply solution. But since it skips the data.frame step, and apply on a data.frame is notoriously slow, I may still "win" or at least "keep even" until the other contestants tumble to the fact that I use a matrix object.

    set.seed(007)
    X <- matrix(sample(c(5:50), 100, replace=TRUE), ncol=10)
    X[ sweep(X, 1, apply(X,1,sd) ) < 0 ] <- 0
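
    The one-liner packs three steps together, so here is a minimal sketch that unpacks them. The intermediate names (row_sd, diffs, below, Y) are introduced only for illustration and are not part of the original answer. It relies on sweep() defaulting to subtraction when no FUN is given.

    ```r
    set.seed(7)
    X <- matrix(sample(5:50, 100, replace = TRUE), ncol = 10)

    row_sd <- apply(X, 1, sd)       # one standard deviation per row
    diffs  <- sweep(X, 1, row_sd)   # default FUN is "-": diffs[i, j] = X[i, j] - row_sd[i]
    below  <- diffs < 0             # TRUE where X[i, j] is below its row's sd

    Y <- X                          # work on a copy so X stays intact
    Y[below] <- 0

    # The same mask via recycling: a vector with one entry per row is
    # recycled down each column, so X[i, j] is compared with row_sd[i].
    stopifnot(all(below == (X < row_sd)))
    ```

    The recycling form, X[X < row_sd] <- 0, is shorter, but it only lines up with rows because matrices are stored column by column; sweep() makes the row-wise intent explicit.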
    

    Note that Richardo and I both got the same starting matrix, which differs from the OP's, although I think he needs to transpose if he wants a row operation:

    > X
       X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
    1  50  0 34 36 41 31  0 18 45  20
    2  23 15 18 17 22 38 28 32 45   0
    3   0 40 50  0 39 40 40 43 16  46
    4   0  0 46  0 25 33 36 33 39   0
    5  16 25 50 22 46 38 30  0 22  38
    6  41  0  0 43 19 22 35 31  0  31
    7  20 30 33 27  0 12 26 25  0  29
    8  49  0 27 41 42  0 27 25 40  21
    9   0 50 49 43 46 22 20 33 21  42
    10 26 19 21 26 49 17 24 47 24  13
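
    The transpose remark comes from how apply() assembles its result: when the function returns a whole vector for each row, those vectors become the columns of the output. The other answer's code is not shown here, so the following is only a sketch of what an apply-based version might look like. zero_below_sd and the small non-square matrix M are hypothetical and chosen so the change in shape is visible.

    ```r
    set.seed(7)
    M <- matrix(sample(5:50, 40, replace = TRUE), nrow = 4)   # 4 rows, 10 columns

    # Hypothetical helper: zero the entries of one row that fall below that row's sd
    zero_below_sd <- function(r) { r[r < sd(r)] <- 0; r }

    raw   <- apply(M, 1, zero_below_sd)   # column i of raw holds the processed row i
    fixed <- t(raw)                       # transpose back to the original orientation

    dim(raw)     # 10 4
    dim(fixed)   # 4 10

    # Agrees with the sweep-based approach on the same matrix
    Z <- M
    Z[sweep(M, 1, apply(M, 1, sd)) < 0] <- 0
    stopifnot(all(fixed == Z))
    ```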
    

    Added note: I was playing around with the rowMeans function to see if I could come up with a vectorized alternative to the apply(X,1,sd) way of getting row-wise standard deviations:

    sqrt(rowSums((X[1:10, ]-rowMeans(X))^2)/9)
    

    So:

     sdbyrow <- function(mat) sqrt(rowSums((mat-rowMeans(mat))^2)/(ncol(mat)-1) )
     all.equal(apply(X,1,sd), sdbyrow(X) )
    #[1] TRUE
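
    Putting the pieces together, the vectorized row sd can replace apply() in the thresholding step as well. This is a sketch under the same assumptions as the code above: a numeric matrix with at least two columns and no missing values. X_vec is a name introduced here for illustration.

    ```r
    # Row-wise sample standard deviation without apply(), as defined above
    sdbyrow <- function(mat) sqrt(rowSums((mat - rowMeans(mat))^2) / (ncol(mat) - 1))

    set.seed(7)
    X <- matrix(sample(5:50, 100, replace = TRUE), ncol = 10)

    # mat - rowMeans(mat) and X < sdbyrow(X) both rely on column-major recycling:
    # a vector with one entry per row is matched against each column in turn.
    X_vec <- X
    X_vec[X < sdbyrow(X)] <- 0
    ```

    Since sdbyrow() and sd() agree only up to floating-point tolerance, an entry that is exactly equal to its row's sd could in principle be treated differently by the two versions.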
    
