I have a lot of rows and columns in a very large matrix (184 x 4000, type double), and I want to remove all 0's. The values in the matrix are usually greater than 0 but the
You could try (x is the example data frame defined below):
x[!rowSums(!x)==ncol(x),] # drops the rows in which every value is 0; could be shortened to
x[!!rowSums(abs(x)),] # inspired by @Richard Scriven
x <- structure(list(V1 = c(2, 0, 2, 2, 2, 3, 2, 0, 0, 3), V2 = c(2,
0, 0, 2, 3, 1, 0, 0, 0, 0), V3 = c(3, 0, 1, 3, 3, 2, 0, 3, 0,
1), V4 = c(3, 0, 2, 3, 2, 2, 2, 1, 2, 1), V5 = c(0, 0, 0, 0,
1, 2, 2, 2, 1, 3)), .Names = c("V1", "V2", "V3", "V4", "V5"), row.names = c(NA,
-10L), class = "data.frame")
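As a quick check on the example data, here is a minimal sketch that splits the first expression into intermediate steps. The names is_zero, n_zero, all_zero and res are made up for illustration, and the data frame is rebuilt with data.frame() only so that the snippet runs on its own.

```r
# Same values as the example data above, rebuilt with data.frame()
x <- data.frame(
  V1 = c(2, 0, 2, 2, 2, 3, 2, 0, 0, 3),
  V2 = c(2, 0, 0, 2, 3, 1, 0, 0, 0, 0),
  V3 = c(3, 0, 1, 3, 3, 2, 0, 3, 0, 1),
  V4 = c(3, 0, 2, 3, 2, 2, 2, 1, 2, 1),
  V5 = c(0, 0, 0, 0, 1, 2, 2, 2, 1, 3)
)

is_zero  <- !x                  # logical matrix: TRUE where an element is 0
n_zero   <- rowSums(is_zero)    # number of 0's in each row
all_zero <- n_zero == ncol(x)   # TRUE for rows made up entirely of 0's
res      <- x[!all_zero, ]      # keep every other row

which(all_zero)                 # row 2 is the only all-zero row

# The shortened form should select the same rows
all.equal(res, x[!!rowSums(abs(x)), ])
```

Row 9 has three 0's but also non-zero values in V4 and V5, so it is kept; only rows where the count of 0's reaches ncol(x) are dropped.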
Explanation of the first expression, step by step:
!x creates a logical matrix of TRUE and FALSE, where TRUE marks the elements that are 0.
rowSums(!x) takes the row-wise sum of those TRUEs, i.e. the number of 0's in each row.
==ncol(x) checks whether that sum equals the number of columns (5 in the above example), which means every entry in the row is 0.
The leading ! negates the result, because these are the rows we want to filter out.
x[ ,] is then subset using this logical index.
Suppose you have NA's in your dataset and you want to remove rows that contain only 0's, or only 0's and NA's, for example:
x <- structure(list(V1 = c(2, 0, 2, 2, 2, 3, 2, 0, 0, 3), V2 = c(2,
NA, 0, 2, 3, 1, 0, 0, 0, 0), V3 = c(3, 0, 1, 3, 3, 2, 0, 3, 0,
1), V4 = c(3, 0, 2, 3, 2, 2, NA, 1, 2, 1), V5 = c(0, 0, 0, 0,
1, 2, 2, 2, 1, 3)), .Names = c("V1", "V2", "V3", "V4", "V5"), row.names = c(NA,
-10L), class = "data.frame")
x[!(rowSums(!is.na(x) & !x)+rowSums(is.na(x)))==ncol(x),]
The idea is to first take the row-wise sum of the NAs: rowSums(is.na(x))
Then take the row-wise sum of all the elements that are not NA and are 0: rowSums(!is.na(x) & !x)
Add the two sums. If the total equals the number of columns, the row contains nothing but 0's and NA's, so delete that row.
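The steps above can be sketched with named intermediates, which makes it easier to see which rows are dropped. The names n_na, n_zero, to_drop, res, alt, m and m_res are hypothetical. The alt line is an alternative formulation, not part of the answer above, and the last two lines assume the same idea carries over to a plain numeric matrix like the one in the question.

```r
# Same values as the NA example above, rebuilt with data.frame()
x <- data.frame(
  V1 = c(2, 0, 2, 2, 2, 3, 2, 0, 0, 3),
  V2 = c(2, NA, 0, 2, 3, 1, 0, 0, 0, 0),
  V3 = c(3, 0, 1, 3, 3, 2, 0, 3, 0, 1),
  V4 = c(3, 0, 2, 3, 2, 2, NA, 1, 2, 1),
  V5 = c(0, 0, 0, 0, 1, 2, 2, 2, 1, 3)
)

n_na    <- rowSums(is.na(x))           # NAs per row
n_zero  <- rowSums(!is.na(x) & !x)     # zeros per row, ignoring NAs
to_drop <- (n_na + n_zero) == ncol(x)  # rows holding nothing but 0's and NAs
res     <- x[!to_drop, ]               # row 2 is dropped, row 7 is kept

# Alternative (an assumption, not from the answer): keep rows that have
# at least one non-zero, non-NA value
alt <- x[rowSums(x != 0, na.rm = TRUE) > 0, ]

# Same idea on a numeric matrix; drop = FALSE keeps the result a matrix
# even if only one row survives
m     <- as.matrix(x)
m_res <- m[rowSums(m != 0, na.rm = TRUE) > 0, , drop = FALSE]
```

Row 7 contains an NA but also non-zero values, so its combined count stays below ncol(x) and it survives the filter.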