R - I want to go through rows of a big matrix and remove all zeros

星月不相逢 2021-01-07 10:55

I have a lot of rows and columns in a very large matrix (184 x 4000, type double), and I want to remove all 0's. The values in the matrix are usually greater than 0 but the

5 answers
  • 2021-01-07 11:30

    Try this for removing the rows that contain only zeros.

    x[!apply(x == 0, 1, all), , drop = FALSE]
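A minimal sketch of why `drop = FALSE` is in that call, using a small made-up matrix `m` (not the asker's data): when only one row survives the filter, plain `[` indexing collapses the result to a vector, while `drop = FALSE` keeps it a matrix.

```r
# Hypothetical 3 x 2 matrix: only the second row has a non-zero value
m <- matrix(c(0, 0,
              5, 0,
              0, 0), ncol = 2, byrow = TRUE)

keep <- !apply(m == 0, 1, all)   # TRUE for rows that are not all zero

with_drop    <- m[keep, ]                 # collapses to a plain numeric vector
without_drop <- m[keep, , drop = FALSE]   # stays a 1 x 2 matrix

is.matrix(with_drop)
is.matrix(without_drop)
dim(without_drop)
```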
    
  • 2021-01-07 11:32

    I finally have the answer. The reason why

    x<- x[which(rowSums(x) > 0),]
    

    only returned 3 rows out of 184 is that rowSums() returns NA for any row that contains an NA, and which() silently drops those NA results, so the subset keeps only rows that have no NA and do not sum to 0. All but 3 of my rows had a few NAs that I wasn't aware of. Simply taking out the NAs did not work, because that didn't solve the rowSums problem. I needed the function to treat my NAs as zeros, so that the rows that did contain NAs (all but 3) would also be summed up and not just dropped from the matrix. So I turned all NAs into zeros by using

    x[is.na(x)] <- 0
    

    and THEN applying the function to sum up all rows and remove the ones that add up to 0. And it worked! Thanks to everyone for your input. Especially @arkun!
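The behaviour described above can be reproduced on a small made-up matrix (the name `m` and its values are assumptions, not the original data). The sketch also shows `rowSums(..., na.rm = TRUE)`, which should give the same row selection without overwriting the NAs. Both variants assume there are no negative values, since rows whose entries cancel out would also sum to 0.

```r
# Hypothetical 5 x 3 matrix: rows 1, 3 and 5 contain NA, row 2 is all zero
m <- matrix(c( 1,  2, NA,
               0,  0,  0,
              NA,  3,  4,
               5,  6,  7,
               0, NA,  0), ncol = 3, byrow = TRUE)

# Naive filter: rowSums() is NA for rows 1, 3 and 5, and which() drops them
naive <- m[which(rowSums(m) > 0), , drop = FALSE]

# Fix from the answer: turn NAs into zeros first, then filter
m0 <- m
m0[is.na(m0)] <- 0
fixed <- m0[which(rowSums(m0) > 0), , drop = FALSE]

# Possible alternative: ignore NAs while summing, leaving the data untouched
kept_rows <- which(rowSums(m, na.rm = TRUE) > 0)

nrow(naive)
nrow(fixed)
kept_rows
```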

  • 2021-01-07 11:32

    This worked for me, a slight change to @Richard Scriven's answer:

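    remove_zeros <- function(x)
    {
      # keep the rows that are not all zero;
      # drop = FALSE keeps the result a matrix even if only one row remains
      x[!apply(x == 0, 1, all), , drop = FALSE]
    }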
    
  • 2021-01-07 11:36

    You can drop rows which only contain 0s like this (and you could replace 0 with any other number if you wanted to drop rows with only that number):

    x <- x[rowSums(x == 0) != ncol(x),]
    

    Explanation:

    • x == 0 creates a matrix of logical values (TRUE/FALSE) and rowSums(x == 0) sums them up (TRUE == 1, FALSE == 0).
    • Then you check if the sum of each row is not equal to the number of columns of your matrix (given by ncol(x)).
    • If that is the case (which means not all entries are 0s), the row will be kept because it evaluates to TRUE. All other rows evaluate to FALSE and will be dropped.
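The three steps above can be traced on a small example. The matrix `m` and the intermediate names are made up for illustration only.

```r
# Hypothetical 3 x 3 matrix: the second row is all zero
m <- matrix(c(2, 0, 3,
              0, 0, 0,
              0, 1, 0), ncol = 3, byrow = TRUE)

is_zero       <- m == 0                  # logical matrix, TRUE where an entry is 0
zeros_per_row <- rowSums(is_zero)        # number of zeros in each row
keep          <- zeros_per_row != ncol(m)  # FALSE only for the all-zero row

result <- m[keep, , drop = FALSE]

zeros_per_row
keep
result
```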
  • 2021-01-07 11:36

    You could try:

    x[!rowSums(!x)==ncol(x),] #could be shortened to
    
    x[!!rowSums(abs(x)),] #Inspired from @Richard Scriven
    

    data

     x <- structure(list(V1 = c(2, 0, 2, 2, 2, 3, 2, 0, 0, 3), V2 = c(2, 
       0, 0, 2, 3, 1, 0, 0, 0, 0), V3 = c(3, 0, 1, 3, 3, 2, 0, 3, 0, 
      1), V4 = c(3, 0, 2, 3, 2, 2, 2, 1, 2, 1), V5 = c(0, 0, 0, 0, 
      1, 2, 2, 2, 1, 3)), .Names = c("V1", "V2", "V3", "V4", "V5"), row.names = c(NA, 
      -10L), class = "data.frame")
    
     
    
    • !x creates a logical index of TRUE and FALSE, where TRUE marks the elements that are 0
    • rowSums(!x) is the row-wise sum of those TRUEs
    • ==ncol(x) checks whether that sum equals the number of columns (5 in the example above), which means all entries in the row are 0
    • The leading ! negates the comparison (in R, ! binds less tightly than ==), because we want to filter out these rows
    • Subset x using this logical index
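One note on the shortened form: my reading is that `abs()` is there so that positive and negative values in a row cannot cancel out to a sum of 0. The example data has no negative values, so the sketch below uses a made-up matrix `m` to show the difference.

```r
# Hypothetical matrix: row 1 sums to 0 but is not all zero
m <- matrix(c(1, -1,
              0,  0,
              2,  3), ncol = 2, byrow = TRUE)

no_abs      <- m[!!rowSums(m), , drop = FALSE]               # wrongly drops row 1
with_abs    <- m[!!rowSums(abs(m)), , drop = FALSE]          # keeps row 1
count_based <- m[!(rowSums(!m) == ncol(m)), , drop = FALSE]  # long form

nrow(no_abs)
nrow(with_abs)
identical(with_abs, count_based)
```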

    Update

    Suppose you have NAs in your dataset and you want to remove rows that contain only 0s, or only 0s and NAs, e.g.

     x <-   structure(list(V1 = c(2, 0, 2, 2, 2, 3, 2, 0, 0, 3), V2 = c(2, 
     NA, 0, 2, 3, 1, 0, 0, 0, 0), V3 = c(3, 0, 1, 3, 3, 2, 0, 3, 0, 
     1), V4 = c(3, 0, 2, 3, 2, 2, NA, 1, 2, 1), V5 = c(0, 0, 0, 0, 
     1, 2, 2, 2, 1, 3)), .Names = c("V1", "V2", "V3", "V4", "V5"), row.names = c(NA, 
     -10L), class = "data.frame")
    
     x[!(rowSums(!is.na(x) & !x)+rowSums(is.na(x)))==ncol(x),]
    
    • The idea is to first sum the NAs row-wise: rowSums(is.na(x))
    • Then take the row-wise sum of all the elements that are not NA and are 0: rowSums(!is.na(x) & !x)
    • Add the two counts. If the total equals the number of columns, delete that row
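A step-by-step sketch of that idea on a small made-up matrix (`m`, `na_count` and `zero_count` are names introduced here for illustration). Note that a row consisting only of NAs also reaches the column count, so this expression removes it as well.

```r
# Hypothetical 5 x 3 matrix mixing zeros, NAs and real values
m <- matrix(c( 2,  3,  0,
               0, NA,  0,
               0,  0,  0,
              NA,  1,  0,
              NA, NA, NA), ncol = 3, byrow = TRUE)

na_count   <- rowSums(is.na(m))          # NAs per row
zero_count <- rowSums(!is.na(m) & !m)    # non-NA zeros per row

# Keep a row unless every entry is either NA or 0
keep   <- !((zero_count + na_count) == ncol(m))
result <- m[keep, , drop = FALSE]

keep
result
```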
