Issue with NA values in R

时光取名叫无心 2021-01-22 07:44

I feel this should be something easy. I have looked all over the internet, but I keep getting error messages. I have done plenty of analytics in the past but am new to R and programming

3 Answers
  • 2021-01-22 07:56

    We can pass na.rm = TRUE to mean() inside the function:

    columnmean <- function(y) {
      nc <- ncol(y)
      means <- numeric(nc)
      for (i in 1:nc) {
        means[i] <- mean(y[, i], na.rm = TRUE)
      }
      means
    }
    

    If na.rm needs to be FALSE on some calls and TRUE on others, add `...` to the arguments of 'columnmean' and forward it to mean(), so the caller decides:

    columnmean <- function(y, ...) {
      nc <- ncol(y)
      means <- numeric(nc)
      for (i in 1:nc) {
        means[i] <- mean(y[, i], ...)
      }
      means
    }

    columnmean(df1, na.rm = TRUE)
    #[1] 1.5000000 0.3333333
    columnmean(df1, na.rm = FALSE)
    #[1] 1.5  NA
    

    Data:

    df1 <- structure(list(num = c(1L, 1L, 2L, 2L), x1 = c(1L, NA, 0L, 0L)),
                     .Names = c("num", "x1"), row.names = c(NA, -4L),
                     class = "data.frame")
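
    Because `...` forwards every extra argument to mean(), the same wrapper also accepts other mean() options such as trim. A minimal sketch, assuming a small made-up data frame `toy` (hypothetical, not from the question):

    ```r
    columnmean <- function(y, ...) {
      nc <- ncol(y)
      means <- numeric(nc)
      for (i in seq_len(nc)) {
        # Anything passed after y (na.rm, trim, ...) goes straight to mean()
        means[i] <- mean(y[, i], ...)
      }
      means
    }

    # Hypothetical toy data, for illustration only
    toy <- data.frame(a = c(1, 2, 3, 100), b = c(2, NA, 4, 6))

    res_na <- columnmean(toy, na.rm = TRUE)
    print(res_na)
    #[1] 26.5  4.0

    # Drop NAs, then trim 25% of the values from each end before averaging
    res_trim <- columnmean(toy, na.rm = TRUE, trim = 0.25)
    print(res_trim)
    #[1] 2.5 4.0
    ```

    The trade-off is that the function signature no longer documents which options are supported; an explicit na.rm parameter is easier to discover.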
    
  • 2021-01-22 08:00

    You can add an na.rm parameter to your function and pass it through to mean():

    columnmean <- function(y, na.rm = FALSE){
      nc <- ncol(y)
      means <- numeric(nc)
      for(i in 1:nc) {
        means[i] <- mean(y[,i], na.rm = na.rm)
      }
      means 
    }
    
    data("airquality")
    columnmean(airquality, na.rm = TRUE)
    #[1] 42.129310 185.931507   9.957516  77.882353   6.993464  15.803922
    
    columnmean(airquality)
    #[1]        NA        NA  9.957516 77.882353  6.993464 15.803922
    

    But my recommendation is to replace the explicit loop with a vectorized alternative such as sapply:

    column_mean <- function(y, na.rm = FALSE) {
      sapply(y, function(x) mean(x, na.rm = na.rm))
    }
    
    column_mean(airquality, na.rm = TRUE)
    #     Ozone    Solar.R       Wind       Temp      Month        Day 
    # 42.129310 185.931507   9.957516  77.882353   6.993464  15.803922
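
    Two related base R options may also be worth a look: vapply, which works like sapply but checks that each result has the declared type and length, and colMeans, the built-in column-mean function. A short sketch on the same airquality data (the names res_vapply and res_builtin are only for illustration):

    ```r
    data("airquality")

    # vapply declares that every column must yield exactly one numeric value
    column_mean <- function(y, na.rm = FALSE) {
      vapply(y, mean, numeric(1), na.rm = na.rm)
    }

    res_vapply <- column_mean(airquality, na.rm = TRUE)

    # colMeans handles numeric data frames and matrices directly
    res_builtin <- colMeans(airquality, na.rm = TRUE)

    # Both approaches should agree, names included
    print(all.equal(res_vapply, res_builtin))
    ```

    colMeans only works when every column is numeric, whereas the sapply/vapply versions can be adapted to skip or handle other column types.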
    
  • 2021-01-22 08:01

    You should pass that parameter in the call to mean(), not to columnmean:

    columnmean <- function(y) {
      nc <- ncol(y)
      means <- numeric(nc)
      for (i in 1:nc) {
        means[i] <- mean(y[, i], na.rm = TRUE)
      }
      means
    }
    

    columnmean is a custom function, and its definition has no na.rm parameter, so R rejects na.rm when it is passed to columnmean itself.
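
    To see the failure this describes, here is a small sketch. The question text is cut off, so the function below is an assumed reconstruction of a version without na.rm, not the asker's actual code:

    ```r
    # Assumed original-style function: no na.rm in its signature
    columnmean <- function(y) {
      nc <- ncol(y)
      means <- numeric(nc)
      for (i in seq_len(nc)) {
        means[i] <- mean(y[, i])
      }
      means
    }

    # R checks arguments against the signature, so this call fails
    # before the loop ever runs; tryCatch captures the error message
    msg <- tryCatch(
      columnmean(airquality, na.rm = TRUE),
      error = function(e) conditionMessage(e)
    )
    print(msg)
    ```

    In an English locale the captured message reports an unused argument, which points at the function signature rather than at the data.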
