How to filter out matrix rows with entries less than a specific value

别跟我提以往 2020-12-18 16:52

I'm dealing with a very big matrix. I want to keep only the rows in which 90% of the entries are bigger than 10. Since I'm not very familiar with R, would someon

2 answers
  • 2020-12-18 17:33

    You can use apply and all to check which rows have all elements > 10

    big.mat <- matrix(rnorm(1000000, 20, 8), 1000, 1000)
    # Apply a function to each row of the matrix
    # (the second argument, 1, means rows; 2 would mean columns).
    # all() returns TRUE only if every element of the logical vector
    # passed to it is TRUE.
    good.lines <- apply(big.mat, 1, function(x){all(x>10)})
    # Rows in which more than 90% of the elements are > 10
    good.lines.90 <- apply(big.mat, 1, function(x){perc <- sum(x>10)/length(x); perc>0.9})
    
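    # drop = FALSE keeps the result a matrix even if only one row is selected
    filtered.mat <- big.mat[good.lines, , drop = FALSE]
    filtered.mat.90 <- big.mat[good.lines.90, , drop = FALSE]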
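
One detail worth checking in the 90% test above is the strict comparison perc > 0.9: a row with exactly 90% of its entries above 10 is not kept. The following is a small sketch on a hand-made matrix (the matrix and the names frac.above, keep.strict and keep.inclusive are illustrative, not part of the answer) showing how > and >= differ at the boundary:

```r
# Illustrative 3 x 10 matrix: rows a, b, c have 10, 9 and 8 entries above 10.
m <- rbind(
  a = c(11, 12, 13, 14, 15, 16, 17, 18, 19, 20),
  b = c(11, 12, 13, 14, 15, 16, 17, 18, 19,  5),
  c = c(11, 12, 13, 14, 15, 16, 17, 18,  5,  5)
)

# Fraction of entries above 10 in each row, computed as in the answer
frac.above <- apply(m, 1, function(x) sum(x > 10) / length(x))

keep.strict    <- frac.above > 0.9   # excludes the row with exactly 90%
keep.inclusive <- frac.above >= 0.9  # includes it

m[keep.strict, , drop = FALSE]     # row a only
m[keep.inclusive, , drop = FALSE]  # rows a and b
```

If "90% of the entries" is meant to include rows sitting exactly at 90%, >= is probably the comparison to use.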
    
  • 2020-12-18 17:40

    I would just use rowSums and common comparison operators.

    Here's a minimal example:

    set.seed(1); m <- matrix(sample(50, 100, TRUE), ncol = 10)
    rowSums(m > 10) == ncol(m)
    #  [1]  TRUE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
    m[rowSums(m > 10) == ncol(m), ]
    #      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
    # [1,]   14   11   47   25   42   24   46   17   22    12
    # [2,]   29   35   33   25   40   22   23   18   20    33
    

    To accommodate a fractional approach, try something like:

    m[rowSums(m > 10) >= (.9 * ncol(m)), ]
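
The fractional filter can also be wrapped in a small function. This is only a sketch: keep_rows_above and its arguments threshold and frac are hypothetical names, not part of base R or of the answer, and it assumes the matrix contains no NA values. It uses rowMeans on the logical matrix, which gives the fraction of entries above the threshold in each row, and drop = FALSE so the result stays a matrix even when a single row survives.

```r
# Hypothetical helper: keep rows where at least `frac` of the entries
# are strictly greater than `threshold`. Assumes `m` has no NA values.
keep_rows_above <- function(m, threshold = 10, frac = 0.9) {
  stopifnot(is.matrix(m), frac >= 0, frac <= 1)
  keep <- rowMeans(m > threshold) >= frac
  m[keep, , drop = FALSE]
}

# Small illustrative matrix: rows have 4/4, 3/4 and 0/4 entries above 10
m <- matrix(c(11, 12, 13, 14,
              11, 12, 13,  5,
               1,  2,  3,  4), nrow = 3, byrow = TRUE)

res_75  <- keep_rows_above(m, threshold = 10, frac = 0.75)  # first two rows
res_all <- keep_rows_above(m, threshold = 10, frac = 1)     # first row, still a matrix
```

For a very large matrix, rowSums and rowMeans are generally faster than calling apply with an R function on each row, since the looping happens in compiled code.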
    