mean-before-after imputation in R

Submitted by 孤街醉人 on 2019-12-02 02:59:56

Question


I'm new to R. How can I impute a missing value using the mean of the data points immediately before and after it?

Example:

Use the mean of the values directly above and below each NA as the imputed value.

-mean for row number 3 is 25.0 (the mean of 27.0 and 23.0)

-mean for row number 7 is 32.5

age
52.0
27.0
NA
23.0
39.0
32.0
NA
33.0
43.0

Thank you.


Answer 1:


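Here is a solution using na.locf from the zoo package, which replaces each NA with the most recent non-NA value before it or, with fromLast = TRUE, with the next non-NA value after it. Averaging the two fills gives the mean of the neighbors on either side.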

library(zoo)
x <- c(52, 27, NA, 23, 39, 32, NA, 33, 43)
0.5*(na.locf(x, fromLast=TRUE) + na.locf(x))
[1] 52.0 27.0 25.0 23.0 39.0 32.0 32.5 33.0 43.0

The advantage of this approach is that it also works when there is more than one consecutive NA:

x <- c(52, 27, NA, 23, 39, NA, NA, 33, 43)
0.5*(na.locf(x, fromLast=TRUE) + na.locf(x))
[1] 52 27 25 23 39 36 36 33 43

EDIT: the rev argument is deprecated, so I replaced it with fromLast.
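The same forward-fill/backward-fill average can be sketched in base R without zoo. The helper names locf_base and mean_before_after below are hypothetical and not part of any package. This is a minimal sketch in which a leading or trailing NA has no neighbor on one side and is therefore left as NA.

```r
# Hypothetical helper: carry the last non-NA value forward (base R only).
locf_base <- function(v) {
  # Index of the most recent non-NA element seen so far (0 if none yet)
  idx <- cummax(seq_along(v) * !is.na(v))
  idx[idx == 0] <- NA
  v[idx]
}

# Hypothetical helper: replace each NA with the mean of the nearest
# non-NA value before it and the nearest non-NA value after it.
mean_before_after <- function(v) {
  na <- is.na(v)
  forward  <- locf_base(v)
  backward <- rev(locf_base(rev(v)))
  v[na] <- 0.5 * (forward + backward)[na]
  v
}

x <- c(52, 27, NA, 23, 39, NA, NA, 33, 43)
mean_before_after(x)
```

Because both fills keep the original length, the two vectors always line up, even when the series starts or ends with an NA.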




Answer 2:


This is a basic manual approach you can take. It assumes every NA is isolated and is neither the first nor the last element:

age <- c(52, 27, NA, 23, 39, 32, NA, 33, 43)
age[is.na(age)] <- rowMeans(cbind(age[which(is.na(age))-1], 
                                  age[which(is.na(age))+1]))
age
# [1] 52.0 27.0 25.0 23.0 39.0 32.0 32.5 33.0 43.0

Or, since you seem to have a single-column data.frame:

mydf <- data.frame(age = c(52, 27, NA, 23, 39, 32, NA, 33, 43))

na_idx <- which(is.na(mydf$age))  # positions of the missing values
mydf$age[na_idx] <- rowMeans(
  cbind(mydf$age[na_idx - 1],
        mydf$age[na_idx + 1]))
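Indexing with position - 1 and position + 1 breaks down when an NA sits at the first or last position. One way around this is to build shifted copies of the vector instead. The sketch below is an illustration rather than part of the original answer, and the names prev_val, next_val and na are made up for it. An NA at either end, or next to another NA, simply stays NA.

```r
age <- c(52, 27, NA, 23, 39, 32, NA, 33, 43)

na <- is.na(age)                    # remember where the gaps are
prev_val <- c(NA, head(age, -1))    # value just before each position
next_val <- c(tail(age, -1), NA)    # value just after each position

# Fill each gap with the mean of its immediate neighbors
age[na] <- ((prev_val + next_val) / 2)[na]
age
```

All three vectors have the same length, so no recycling or out-of-range indexing can occur.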



Answer 3:


Just another way:

age <- c(52, 27, NA, 23, 39, 32, NA, 33, 43)
age[is.na(age)] <- apply(sapply(which(is.na(age)), "+", c(-1, 1)), 2, 
                         function(x) mean(age[x]))
age
## [1] 52.0 27.0 25.0 23.0 39.0 32.0 32.5 33.0 43.0
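For an isolated NA, the mean of the two neighbors is the same as the midpoint of a straight line between them, so base R's approx can also be used. This is a minimal sketch that assumes there is no NA at the first or last position (with the default rule = 1, approx returns NA outside the range of the known points). The names pos, known and filled are illustrative.

```r
age <- c(52, 27, NA, 23, 39, 32, NA, 33, 43)

pos   <- seq_along(age)
known <- !is.na(age)

# Linear interpolation through the known points, evaluated at every position
filled <- approx(pos[known], age[known], xout = pos)$y
filled
```

For runs of consecutive NAs, linear interpolation produces a ramp between the two surrounding values rather than the same average for every gap, so the results then differ from the before/after mean.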



Answer 4:


You are looking for moving average imputation. You can use the na.ma function from the imputeTS package for this (in imputeTS 3.0 and later the function is named na_ma).

library(imputeTS)
x <- c(52, 27, NA, 23, 39, NA, NA, 33, 43)
na.ma(x, k=1, weighting = "simple")

[1] 52.00000 27.00000 25.00000 23.00000 39.00000 31.66667 38.33333 33.00000 43.00000

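For an isolated NA this produces exactly the required result (25.0 at position 3). The k parameter specifies how many neighbors on each side are taken into account for the calculation. Where consecutive NAs leave fewer than two non-NA values in the window, the window is widened, which is why positions 6 and 7 get 31.66667 and 38.33333 rather than the plain before/after mean.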
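The values at positions 6 and 7 can be checked by hand. The sketch below uses a hypothetical helper, window_mean, which is not part of imputeTS. It assumes the window for those two positions ends up spanning two neighbors on each side, which matches the printed output.

```r
x <- c(52, 27, NA, 23, 39, NA, NA, 33, 43)

# Hypothetical helper: simple mean of the non-NA values within k positions
# on either side of position i (the position itself is excluded).
window_mean <- function(v, i, k) {
  idx <- setdiff(max(1, i - k):min(length(v), i + k), i)
  mean(v[idx], na.rm = TRUE)
}

window_mean(x, 3, 1)  # neighbors 27 and 23
window_mean(x, 6, 2)  # non-NA values in window: 23, 39, 33
window_mean(x, 7, 2)  # non-NA values in window: 39, 33, 43
```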



Source: https://stackoverflow.com/questions/15308205/mean-before-after-imputation-in-r
