I feel this should be something easy. I have searched the internet, but I keep getting error messages. I have done plenty of analytics in the past but am new to R and programming.
We can include na.rm = TRUE in the call to mean:
columnmean <- function(y) {
  nc <- ncol(y)
  means <- numeric(nc)
  for (i in 1:nc) {
    means[i] <- mean(y[, i], na.rm = TRUE)
  }
  means
}
If na.rm needs to be FALSE in some calls and TRUE in others, let columnmean accept extra arguments through ... and forward them to mean:
columnmean <- function(y, ...) {
  nc <- ncol(y)
  means <- numeric(nc)
  for (i in 1:nc) {
    means[i] <- mean(y[, i], ...)
  }
  means
}
columnmean(df1, na.rm = TRUE)
#[1] 1.5000000 0.3333333
columnmean(df1, na.rm = FALSE)
#[1] 1.5 NA
The example data used above:
df1 <- structure(list(num = c(1L, 1L, 2L, 2L), x1 = c(1L, NA, 0L, 0L)),
  .Names = c("num", "x1"), row.names = c(NA, -4L), class = "data.frame")
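Because ... forwards every extra argument to mean, the same wrapper also accepts other mean arguments such as trim. A minimal sketch, repeating the columnmean definition from this answer; the data frame df2 is made up for illustration and is not from the question:

```r
# Same wrapper as above: extra arguments are forwarded to mean() via `...`
columnmean <- function(y, ...) {
  nc <- ncol(y)
  means <- numeric(nc)
  for (i in seq_len(nc)) {
    means[i] <- mean(y[, i], ...)
  }
  means
}

# Hypothetical example data (not from the question)
df2 <- data.frame(a = c(1, 2, 3, 100), b = c(NA, 2, 4, 6))

# No extra arguments: mean() uses its defaults, so the NA in b propagates
res_default <- columnmean(df2)

# na.rm is forwarded, so NA values are dropped before averaging
res_narm <- columnmean(df2, na.rm = TRUE)

# Any other mean() argument can be forwarded the same way, e.g. trim
res_trim <- columnmean(df2, na.rm = TRUE, trim = 0.25)
```

The trade-off is that the function signature no longer documents which arguments are supported; a caller has to know that they end up in mean.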
You can add an na.rm parameter to your function and pass it on to mean:
columnmean <- function(y, na.rm = FALSE) {
  nc <- ncol(y)
  means <- numeric(nc)
  for (i in 1:nc) {
    means[i] <- mean(y[, i], na.rm = na.rm)
  }
  means
}
data("airquality")
columnmean(airquality, na.rm = TRUE)
#[1] 42.129310 185.931507 9.957516 77.882353 6.993464 15.803922
columnmean(airquality)
#[1] NA NA 9.957516 77.882353 6.993464 15.803922
That said, I would recommend replacing the explicit loop with sapply:
column_mean <- function(y, na.rm = FALSE) {
  sapply(y, function(x) mean(x, na.rm = na.rm))
}
column_mean(airquality, na.rm = TRUE)
#      Ozone    Solar.R       Wind       Temp      Month        Day
#  42.129310 185.931507   9.957516  77.882353   6.993464  15.803922
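For a data frame whose columns are all numeric, base R's colMeans does the same job and takes na.rm directly. A short sketch for comparison, using the built-in airquality data; the variable names cm and sm are just for illustration:

```r
data("airquality")

# colMeans() is part of base R and accepts na.rm directly
cm <- colMeans(airquality, na.rm = TRUE)

# The sapply() approach from above, for comparison
sm <- sapply(airquality, function(x) mean(x, na.rm = TRUE))

# Both return a named numeric vector with one entry per column
all.equal(cm, sm)
```

Note that colMeans converts the data frame to a matrix first, so it only works when every column is numeric, whereas the sapply version handles each column on its own.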
You should be using that parameter in the mean function call:
columnmean <- function(y) {
  nc <- ncol(y)
  means <- numeric(nc)
  for (i in 1:nc) {
    means[i] <- mean(y[, i], na.rm = TRUE)
  }
  means
}
columnmean is a custom function and does not have that parameter.
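To illustrate, here is a small sketch of what happens when na.rm is passed to a function that does not declare it. This is presumably the kind of error the question ran into, though the exact message there is not shown. The data frame df_demo is made up for illustration:

```r
# columnmean() as defined above: na.rm is fixed inside, not a parameter
columnmean <- function(y) {
  nc <- ncol(y)
  means <- numeric(nc)
  for (i in seq_len(nc)) {
    means[i] <- mean(y[, i], na.rm = TRUE)
  }
  means
}

# Hypothetical example data (not from the question)
df_demo <- data.frame(x1 = c(1, NA, 0, 0))

# Passing an argument the function does not declare raises an error;
# tryCatch() captures the message instead of stopping the script
msg <- tryCatch(
  columnmean(df_demo, na.rm = TRUE),
  error = function(e) conditionMessage(e)
)
msg

# Calling it without na.rm works, since mean() already gets na.rm = TRUE
ok <- columnmean(df_demo)
```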