R - I want to go through rows of a big matrix and remove all zeros


I have a lot of rows and columns in a very large matrix (184 x 4000, type double), and I want to remove all 0's. The values in the matrix are usually greater than 0 but the

5 Answers

梦如初夏

2021-01-07 11:30
Try this for removing the rows that contain only zeros.
```
x[!apply(x == 0, 1, all), , drop = FALSE]
```
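As a rough illustration of why `drop = FALSE` is in there, here is a minimal sketch on a made-up 3 x 2 matrix (the values are hypothetical, not the asker's data). When only one row survives, plain subsetting collapses the result to a vector, while `drop = FALSE` keeps it a matrix.

```r
# Hypothetical example data: only the first row has non-zero values
x <- matrix(c(1, 2,
              0, 0,
              0, 0), nrow = 3, byrow = TRUE)

# TRUE for rows where every entry equals 0
all_zero <- apply(x == 0, 1, all)

# With drop = FALSE the single remaining row is still a 1 x 2 matrix
with_drop <- x[!all_zero, , drop = FALSE]

# Without it, R simplifies the result to a plain numeric vector
without_drop <- x[!all_zero, ]

is.matrix(with_drop)     # TRUE
is.matrix(without_drop)  # FALSE
```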
清歌不尽

2021-01-07 11:32
I finally have the answer. The reason why
```
x<- x[which(rowSums(x) > 0),]
```
only returned 3 rows out of 184 was that rowSums(x) returns NA for any row containing an NA, so rowSums(x) > 0 is NA for those rows and which() silently drops them. Only rows with no NA's and a positive sum are kept. I had a few NA's in all but 3 rows, which I just wasn't aware of. Simply removing the NA's did not work, because that didn't solve the rowSums problem. I needed the NA's to be treated as zeros, so that the rows that contained NA's (all but 3) would still be summed up instead of being dropped from the matrix. So I turned all NA's into zeros by using
```
x[is.na(x)] <- 0
```
and THEN applying the function to sum up all rows and remove the ones that add up to 0. And it worked! Thanks to everyone for your input. Especially @arkun!
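The behaviour described above can be reproduced on a small made-up matrix (the values are illustrative, not the original data). The sketch also shows `rowSums(x, na.rm = TRUE)`, an alternative that was not part of the fix described here: it ignores NA's while summing, so the NA's themselves stay in the result.

```r
# Hypothetical data: row 2 is all zeros, rows 3 and 4 contain an NA
x <- matrix(c(1, 2,  3,
              0, 0,  0,
              4, NA, 5,
              0, NA, 0), nrow = 4, byrow = TRUE)

# rowSums() is NA for rows 3 and 4, and which() drops NA comparisons,
# so only row 1 is kept
kept_naive <- x[which(rowSums(x) > 0), , drop = FALSE]

# Fix from the answer: turn NA's into zeros first, then filter
x0 <- x
x0[is.na(x0)] <- 0
kept_fixed <- x0[rowSums(x0) > 0, , drop = FALSE]

# Alternative (an assumption, not the original fix): skip NA's while summing
kept_narm <- x[rowSums(x, na.rm = TRUE) > 0, , drop = FALSE]

nrow(kept_naive)  # 1
nrow(kept_fixed)  # 2
nrow(kept_narm)   # 2
```

Note that both variants assume non-negative values, since a row such as -1, 1 also sums to 0.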
情深已故

2021-01-07 11:32
This worked for me, a slight modification of @Richard Scriven's answer:
```
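remove_zeros <- function(x) {
  # Keep only the rows that are not entirely zero;
  # drop = FALSE keeps the result a matrix even if a single row remains
  x[!apply(x == 0, 1, all), , drop = FALSE]
}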
```
不知归路

2021-01-07 11:36
You can drop rows which only contain 0s like this (and you could replace 0 with any other number if you wanted to drop rows with only that number):
```
x <- x[rowSums(x == 0) != ncol(x),]
```
Explanation:
- x == 0 creates a matrix of logical values (TRUE/FALSE) and rowSums(x == 0) sums them up (TRUE == 1, FALSE == 0).
- Then you check if the sum of each row is not equal to the number of columns of your matrix (which are counted by ncol(x)).
- If that is the case (which means not all entries are 0s), the row will be kept because it evaluates to TRUE. All other rows evaluate to FALSE and will be dropped.
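The steps above can be traced on a small made-up matrix (hypothetical values). This is only a sketch. It assumes there are no NA's in the data, since an NA would make the row count NA as well.

```r
# Hypothetical example: row 2 contains only zeros
x <- matrix(c(2, 0, 3,
              0, 0, 0,
              0, 1, 0), nrow = 3, byrow = TRUE)

# Step 1: count the zeros in each row (TRUE counts as 1)
zero_count <- rowSums(x == 0)   # 1 3 2

# Step 2: keep a row if its zero count differs from the column count
keep <- zero_count != ncol(x)   # TRUE FALSE TRUE

# Step 3: subset, keeping the matrix shape
result <- x[keep, , drop = FALSE]
```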

北荒

2021-01-07 11:36

You could try:

```
x[!rowSums(!x)==ncol(x),] # could be shortened to
x[!!rowSums(abs(x)),]     # inspired by @Richard Scriven
```

Data:

```
x <- structure(list(V1 = c(2, 0, 2, 2, 2, 3, 2, 0, 0, 3), V2 = c(2,
  0, 0, 2, 3, 1, 0, 0, 0, 0), V3 = c(3, 0, 1, 3, 3, 2, 0, 3, 0,
  1), V4 = c(3, 0, 2, 3, 2, 2, 2, 1, 2, 1), V5 = c(0, 0, 0, 0,
  1, 2, 2, 2, 1, 3)), .Names = c("V1", "V2", "V3", "V4", "V5"), row.names = c(NA,
  -10L), class = "data.frame")
```

- !x creates a logical index of TRUE and FALSE, where TRUE marks the elements that are 0.
- rowSums(!x) is the row-wise sum of those TRUEs.
- ==ncol(x) checks whether that sum equals the number of columns (5 in the example above), which means all entries in the row are 0.
- ! negates the result, because we want to filter out these rows.
- Finally, subset x using this logical index.
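To see that the long and the short form pick the same rows, here is a small sketch on a made-up data frame (hypothetical values, deliberately including a negative number). It also hints at why abs() is used in the short form: a row such as -1, 1, 0 sums to 0 without being all zeros.

```r
# Hypothetical data: row 2 is all zeros, row 3 sums to 0 but is not all zeros
x <- data.frame(V1 = c(2, 0, -1, 0),
                V2 = c(0, 0,  1, 0),
                V3 = c(3, 0,  0, 5))

# Long form: drop rows whose zero count equals the number of columns
a <- x[!rowSums(!x) == ncol(x), ]

# Short form: a row-wise sum of absolute values is 0 only for all-zero rows
b <- x[!!rowSums(abs(x)), ]

# A plain sum would also drop row 3, which is not what we want
plain <- x[rowSums(x) > 0, ]

identical(a, b)  # TRUE
nrow(a)          # 3
nrow(plain)      # 2
```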

Update

Suppose you have NA's in your dataset and you want to remove rows that contain only 0's, or only 0's and NA's, e.g.

```
x <- structure(list(V1 = c(2, 0, 2, 2, 2, 3, 2, 0, 0, 3), V2 = c(2,
  NA, 0, 2, 3, 1, 0, 0, 0, 0), V3 = c(3, 0, 1, 3, 3, 2, 0, 3, 0,
  1), V4 = c(3, 0, 2, 3, 2, 2, NA, 1, 2, 1), V5 = c(0, 0, 0, 0,
  1, 2, 2, 2, 1, 3)), .Names = c("V1", "V2", "V3", "V4", "V5"), row.names = c(NA,
  -10L), class = "data.frame")

x[!(rowSums(!is.na(x) & !x)+rowSums(is.na(x)))==ncol(x),]
```

The idea:
- First count the NAs row-wise: rowSums(is.na(x)).
- Then count, row-wise, the elements that are not NA and are 0: rowSums(!is.na(x) & !x).
- Add the two counts. If the total equals the number of columns, delete that row.
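The three steps can be sketched on a small made-up matrix (hypothetical values). The last line shows a shorter form, `rowSums(is.na(x) | x == 0)`, which is not from the answer above but should give the same index, because TRUE | NA evaluates to TRUE in R.

```r
# Hypothetical data: row 2 is zeros plus an NA, row 3 is all zeros
x <- matrix(c(2,  NA, 3,
              0,  NA, 0,
              0,  0,  0,
              NA, 1,  0), nrow = 4, byrow = TRUE)

# Step 1: NA's per row
na_count <- rowSums(is.na(x))            # 1 1 0 1

# Step 2: entries per row that are not NA and equal 0
zero_count <- rowSums(!is.na(x) & !x)    # 0 2 3 1

# Step 3: drop a row when NA's plus zeros fill every column
keep <- (na_count + zero_count) != ncol(x)
result <- x[keep, , drop = FALSE]        # rows 1 and 4 remain

# Shorter equivalent (an assumption, not part of the answer above)
keep_short <- rowSums(is.na(x) | x == 0) != ncol(x)
```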
