The difference between na.rm and na.omit in R

野的像风 2021-02-08 22:55

I've just started with R and I've executed these statements:

library(datasets)
head(airquality)
s <- split(airquality,airquality$Month)
sapply(s, function(x         


        
2 answers
  •  时光取名叫无心
    2021-02-08 23:12

    They are not supposed to give the same result. Consider this example:

    exdf<-data.frame(a=c(1,NA,5),b=c(3,2,2))
    #   a b
    #1  1 3
    #2 NA 2
    #3  5 2
    colMeans(exdf,na.rm=TRUE)
    #       a        b 
    #3.000000 2.333333
    colMeans(na.omit(exdf))
    #  a   b 
    #3.0 2.5
    

    Why is this? In the first case, na.rm=TRUE drops missing values within each column separately, so the mean of column b uses all three of its values: (3+2+2)/3 ≈ 2.33. In the second case, na.omit removes the second row in its entirety because a is NA there. That also discards the row's b value, which is not NA and was therefore counted in the first case, so the b mean is just (3+2)/2 = 2.5.
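To make the row-versus-column distinction concrete, here is a minimal sketch that reuses the exdf data frame from the answer. The variable names (per_column, per_row, kept, per_vector) are illustrative and not part of the original post. It also shows that calling na.omit on each column individually, rather than on the whole data frame, should match the na.rm=TRUE result.

```r
# Same toy data frame as in the answer above.
exdf <- data.frame(a = c(1, NA, 5), b = c(3, 2, 2))

# na.rm = TRUE drops NAs column by column, so b keeps all three values.
per_column <- colMeans(exdf, na.rm = TRUE)

# na.omit() on a data frame drops every row that contains an NA
# before any mean is computed, so b loses its second value too.
complete_rows <- na.omit(exdf)
per_row <- colMeans(complete_rows)

# complete.cases() flags the rows that na.omit() keeps.
kept <- complete.cases(exdf)

# Applying na.omit() to each column as a vector only removes that
# column's own NAs, which is what na.rm = TRUE does.
per_vector <- sapply(exdf, function(col) mean(na.omit(col)))

print(per_column)
print(per_row)
print(kept)
print(per_vector)
```

If the goal is a per-column statistic that ignores only that column's missing values, na.rm=TRUE (or na.omit applied per column) is usually the closer fit. na.omit on the whole data frame is more appropriate when rows must be complete across all columns.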
