How to remove rows with any zero value

时光取名叫无心 2020-11-28 07:04

I need to remove rows that contain a zero value in R. For NA values I can use na.omit() or complete.cases() to delete the affected rows, but how do I do the same for zeros?

8 answers
  • 2020-11-28 07:28

    I would do the following.

    Set the zeros to NA.

     data[data==0] <- NA
     data
    

    Then delete the rows that contain NA.

     data2<-data[complete.cases(data),]
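
    The two steps above can be sketched end to end on a small frame. The data here are hypothetical, and the copy named `cleaned` is only an illustration so that the original keeps its zeros. Note that rows which already contained NA are dropped as well.

    ```r
    # Hypothetical example data (not from the question)
    data <- data.frame(id = 1:4, x = c(5, 0, 3, 2), y = c(1, 4, 0, 6))

    cleaned <- data                  # work on a copy; `data` keeps its zeros
    cleaned[cleaned == 0] <- NA      # turn every zero into NA
    cleaned <- cleaned[complete.cases(cleaned), ]  # drop rows with any NA
    cleaned
    ```

    Only the rows with id 1 and 4 remain, because rows 2 and 3 each contained a zero.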
    
  • 2020-11-28 07:28

    Using tidyverse/dplyr, you can also remove rows with any zero value in a subset of variables:

    # variables starting with Mac must be non-zero
    filter_at(df, vars(starts_with("Mac")), all_vars((.) != 0))
    
    # variables x, y, and z must be non-zero
    filter_at(df, vars(x, y, z), all_vars((.) != 0))
    
    # all numeric variables must be non-zero
    filter_if(df, is.numeric, all_vars((.) != 0))
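
    In dplyr 1.0.0 and later, filter_at() and filter_if() are superseded; if_all() (available from dplyr 1.0.4) expresses the same conditions inside a plain filter(). A minimal sketch, assuming a recent dplyr and a hypothetical data frame:

    ```r
    library(dplyr)

    # Hypothetical data with the same column pattern as the examples above
    df <- data.frame(id = c("a", "b", "c"),
                     Mac1 = c(1, 0, 3),
                     Mac2 = c(4, 5, 0))

    # Keep rows where every column starting with "Mac" is non-zero
    kept <- filter(df, if_all(starts_with("Mac"), ~ .x != 0))

    # Keep rows where every numeric column is non-zero
    kept_numeric <- filter(df, if_all(where(is.numeric), ~ .x != 0))
    ```

    Both calls keep only row "a" here, since rows "b" and "c" each contain a zero.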
    
  • 2020-11-28 07:32

    I would probably go with Joran's suggestion of replacing 0s with NAs and then using the built-in functions you mentioned. If you can't or don't want to do that, one approach is to use any() to find the rows that contain 0s and subset them out:

    set.seed(42)
    # Fake data
    x <- data.frame(a = sample(0:2, 5, TRUE), b = sample(0:2, 5, TRUE))
    x
    #   a b
    # 1 2 1
    # 2 2 2
    # 3 0 0
    # 4 2 1
    # 5 1 2

    # Subset out any rows with a 0 in them
    # Note the negation with ! around the apply function
    x[!(apply(x, 1, function(y) any(y == 0))), ]
    #   a b
    # 1 2 1
    # 2 2 2
    # 4 2 1
    # 5 1 2
    

    To implement Joran's method, something like this should get you started:

    x[x==0] <- NA
    
  • 2020-11-28 07:32

    In base R, we can select the columns we want to test with grep, compare them with 0, and use rowSums to keep the rows in which every selected value is non-zero.

    cols <- grep("^Mac", names(df))
    df[rowSums(df[cols] != 0) == length(cols), ]
    
    #          DateTime Mac1 Mac2 Mac3 Mac4
    #2 2011-04-02 06:05   21   21   21   21
    #3 2011-04-02 06:10   22   22   22   22
    #5 2011-04-02 06:20   24   24   24   24
    

    The same result with inverted logic, keeping the rows that contain no zeros:

    df[rowSums(df[cols] == 0) == 0, ]
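
    If the tested columns can contain NA, rowSums() returns NA for those rows and the subset then produces rows filled with NA. A possible variation, assuming a missing value should not count as a zero, passes na.rm = TRUE. The data below are hypothetical:

    ```r
    # Hypothetical data: row 2 has a zero, row 3 has a missing value
    df <- data.frame(DateTime = c("t1", "t2", "t3"),
                     Mac1 = c(20, 0, 22),
                     Mac2 = c(21, 21, NA))
    cols <- grep("^Mac", names(df))

    # na.rm = TRUE counts only the known zeros, so the NA row is kept
    kept <- df[rowSums(df[cols] == 0, na.rm = TRUE) == 0, ]
    ```

    Here rows 1 and 3 are kept and only the row with an actual zero is removed.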
    

    In dplyr, we can use filter_at to test for specific columns and use all_vars to select rows where all the values are not equal to 0.

    library(dplyr)
    df %>%  filter_at(vars(starts_with("Mac")), all_vars(. != 0))
    

    data

    df <- structure(list(DateTime = structure(1:6, .Label = c("2011-04-02 06:00", 
    "2011-04-02 06:05", "2011-04-02 06:10", "2011-04-02 06:15", "2011-04-02 06:20", 
    "2011-04-02 06:25"), class = "factor"), Mac1 = c(20L, 21L, 22L, 
    23L, 24L, 0L), Mac2 = c(0L, 21L, 22L, 23L, 24L, 25L), Mac3 = c(20L, 
    21L, 22L, 0L, 24L, 25L), Mac4 = c(20L, 21L, 22L, 23L, 24L, 0L
    )), class = "data.frame", row.names = c(NA, -6L))
    
  • 2020-11-28 07:41

    Well, you could swap your 0s for NA and then use one of those solutions, but for the sake of variety: a number has a finite logarithm only if it is greater than 0, so the rowSums of the logs is finite only when a row contains no zeros. This assumes the values are non-negative; log() of a negative number is NaN, so rows with negative values would be dropped too.

    dfr[is.finite(rowSums(log(dfr[-1]))),]
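
    A small worked example of the logarithm trick, using hypothetical data in which the first column is a label (which is what dfr[-1] skips):

    ```r
    # Hypothetical data; every column after the first is numeric and non-negative
    dfr <- data.frame(DateTime = c("t1", "t2", "t3"),
                      Mac1 = c(20, 0, 22),
                      Mac2 = c(21, 23, 24))

    # log(0) is -Inf, so a row containing a zero has a non-finite row sum
    keep <- is.finite(rowSums(log(dfr[-1])))
    dfr[keep, ]
    ```

    Row 2 is dropped because log(0) makes its row sum -Inf.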
    
  • 2020-11-28 07:47

    There are a few different ways of doing this. I prefer using apply, since it's easily extendable:

    ##Generate some data
    dd = data.frame(a = 1:4, b= 1:0, c=0:3)
    
    ##Go through each row and determine whether all of its values are non-zero
    row_sub = apply(dd, 1, function(row) all(row !=0 ))
    ##Subset as usual
    dd[row_sub,]
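
    One caveat when extending this: apply() converts the data frame to a matrix first, so a character or factor column turns every value into a string, which can make the comparison with 0 unreliable. A sketch that restricts the test to numeric columns, with a hypothetical label column `name` added for illustration:

    ```r
    # Hypothetical data with a non-numeric label column
    dd <- data.frame(name = c("w", "x", "y", "z"),
                     a = 1:4, b = c(1, 0, 1, 0), c = 0:3)

    num_cols <- sapply(dd, is.numeric)   # TRUE for the columns to test
    # Test only the numeric columns, then subset the full data frame
    row_sub <- apply(dd[num_cols], 1, function(row) all(row != 0))
    dd[row_sub, ]
    ```

    Only the row labelled "y" survives, since every other row has a zero in b or c.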
    