I've just started with R and I've executed these statements:
library(datasets)
head(airquality)
s <- split(airquality,airquality$Month)
sapply(s, function(x
sapply(s, function(x) {colMeans(x[,c("Ozone", "Solar.R", "Wind")], na.rm = TRUE)})
treats each column individually, and calculates the average of the non-NA values in each column.
lapply(s, function(x) {colMeans(na.omit(x[,c("Ozone", "Solar.R", "Wind")])) })
first subsets each data frame in s to the rows where none of the three columns is NA, and then takes the column means of the remaining rows.
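The difference comes from the rows in which one or two of the three values are NA: with na.rm = TRUE their non-NA values still count toward the column means, whereas na.omit drops those rows entirely.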
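To see the effect in isolation, here is a minimal sketch on a made-up data frame (the name df and its values are hypothetical, not taken from airquality). Row 2 is missing only a and row 3 is missing only b, so the two approaches average over different sets of rows:

```r
# Hypothetical toy data: row 2 lacks `a`, row 3 lacks `b`
df <- data.frame(a = c(1, NA, 3, 8), b = c(10, 20, NA, 40))

# Column-wise: each column ignores only its own NA
per_column <- colMeans(df, na.rm = TRUE)

# Row-wise: na.omit() removes rows 2 and 3 completely before averaging
complete_rows <- colMeans(na.omit(df))

print(per_column)     # a uses rows 1, 3, 4; b uses rows 1, 2, 4
print(complete_rows)  # both columns use rows 1 and 4 only
```

Which one you want depends on the question: na.rm = TRUE uses every available value per column, while na.omit keeps the columns comparable by averaging over the same rows.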