I'm dealing with a very big matrix. I want to keep only the rows in which 90% of the entries are bigger than 10. Since I'm not very familiar with R, would someon
You can use apply and all to check which rows have all elements > 10:
big.mat <- matrix(rnorm(1000000, 20, 8), 1000, 1000)
# Apply a function to each row of the matrix
# (the 1 passed to apply means rows; 2 would mean columns).
# all() returns TRUE if every element of the vector passed
# to it is TRUE.
good.lines <- apply(big.mat, 1, function(x){all(x>10)})
# Rows in which more than 90% of the elements are > 10
good.lines.90 <- apply(big.mat, 1, function(x){perc <- sum(x>10)/length(x); perc>0.9})
filtered.mat <- big.mat[good.lines,]
filtered.mat.90 <- big.mat[good.lines.90,]
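One detail worth checking is the boundary case: the code above uses perc > 0.9, so a row with exactly 90% of its entries above 10 is dropped. Here is a small sketch on a hand-made matrix to show the difference. The name small.mat and its values are made up for illustration, and comparing counts instead of fractions is my own choice, not part of the answer above.

```r
# Hypothetical 3 x 10 matrix, filled row by row
small.mat <- matrix(c(
  rep(20, 10),               # row 1: all 10 entries > 10
  rep(20, 9), 5,             # row 2: 9 of 10 entries > 10 (exactly 90%)
  rep(20, 5), rep(5, 5)      # row 3: 5 of 10 entries > 10
), nrow = 3, byrow = TRUE)

# Number of entries above 10 in each row
count.above <- apply(small.mat, 1, function(x) sum(x > 10))
count.above
# [1] 10  9  5

# Strictly more than 90% versus at least 90%
keep.strict    <- count.above >  0.9 * ncol(small.mat)
keep.inclusive <- count.above >= 0.9 * ncol(small.mat)

keep.strict
# [1]  TRUE FALSE FALSE
keep.inclusive
# [1]  TRUE  TRUE FALSE
```

If "90% of the entries" in the question is meant as "at least 90%", the >= version is probably the one you want.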
I would just use rowSums and common comparison operators. Here's a minimal example:
set.seed(1); m <- matrix(sample(50, 100, TRUE), ncol = 10)
rowSums(m > 10) == ncol(m)
# [1] TRUE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
m[rowSums(m > 10) == ncol(m), ]
#      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,]   14   11   47   25   42   24   46   17   22    12
# [2,]   29   35   33   25   40   22   23   18   20    33
To accommodate a fractional approach, try something like:
m[rowSums(m > 10) >= (.9 * ncol(m)), ]
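Two practical notes on the fractional version, as a rough sketch rather than anything definitive. rowMeans on the logical matrix gives the fraction per row directly, and if only one row passes the filter, plain indexing returns a vector instead of a matrix unless you add drop = FALSE. The matrix m2 below is made up for illustration.

```r
# Hypothetical matrix: only row 2 has at least 90% of its entries above 10
m2 <- rbind(c( 1,  2, 30, 40),
            c(11, 12, 13, 14),
            c(50,  1,  1,  1))

# Fraction of entries above 10 in each row
frac.above <- rowMeans(m2 > 10)
frac.above
# [1] 0.50 1.00 0.25

keep <- frac.above >= 0.9

res.vec <- m2[keep, ]                # dimensions dropped: a plain vector
res.mat <- m2[keep, , drop = FALSE]  # stays a 1 x 4 matrix

dim(res.vec)
# NULL
dim(res.mat)
# [1] 1 4
```

If the matrix can contain NA, note that rowSums and rowMeans return NA for those rows unless you pass na.rm = TRUE, so decide first how missing entries should count.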