Question
I am really baffled about why my imputation is failing in R's mice package (version 2.22). I am attempting a very simple operation with the following data frame:
> dfn
a b c d
1 0 1 0 1
2 1 0 0 0
3 0 0 0 0
4 NA 0 0 0
5 0 0 0 NA
I then use mice in the following way to perform a simple mean imputation:
imp <- mice(dfn, method = "mean", m = 1, maxit = 1)
filled <- complete(imp)
However, my completed data looks like this:
> filled
a b c d
1 0.00 1 0 1
2 1.00 0 0 0
3 0.00 0 0 0
4 0.25 0 0 0
5 0.00 0 0 NA
Why am I still getting this trailing NA? This is the simplest failing example I could construct, but my real data set is much larger and I am just trying to get a sense of where things are going wrong. Any help would be greatly appreciated!
Answer 1:
I'm not sure how accurate this is, but here is an attempt. Even though method="mean" is supposed to impute the unconditional mean, it appears from the documentation that the predictorMatrix is not changed accordingly.
Normally, leftover NAs occur because the predictors suffer from multicollinearity or because there are too few cases per variable (so that the imputation model cannot be estimated). However, method="mean" shouldn't behave that way.
Here is what I did:
dfn <- read.table(text="a b c d
0 1 0 1
1 0 0 0
0 0 0 0
NA 0 0 0
0 0 0 NA", header=TRUE)
imp <- mice( dfn, method="mean", predictorMatrix=diag(ncol(dfn)) )
complete(imp)
# 1 0.00 1 0 1.00
# 2 1.00 0 0 0.00
# 3 0.00 0 0 0.00
# 4 0.25 0 0 0.00
# 5 0.00 0 0 0.25
You can try this using your actual data set, but you should check the results carefully. For example, do:
sapply(dfn, function(x) mean(x,na.rm=TRUE))
The means for each variable should be identical to those that have been imputed. Please let me know if this solves your problem.
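To make that check concrete, here is a minimal base-R sketch that fills each NA with its column mean by hand, so there is a reference to compare against. It does not use mice at all; the names col_means and manual are only illustrative and not part of any package.

```r
# Rebuild the example data frame from the question.
dfn <- read.table(text = "a b c d
0 1 0 1
1 0 0 0
0 0 0 0
NA 0 0 0
0 0 0 NA", header = TRUE)

# Unconditional mean of each column, ignoring missing values.
col_means <- sapply(dfn, function(x) mean(x, na.rm = TRUE))

# Hand-rolled mean imputation (illustrative only, not the mice implementation):
# replace every NA in a column with that column's observed mean.
manual <- dfn
for (v in names(manual)) {
  missing <- is.na(manual[[v]])
  manual[[v]][missing] <- col_means[[v]]
}

print(col_means)
print(manual)
```

If the mice call worked as intended, complete(imp) should contain the same values as manual, and any column that still has NA in complete(imp) but not in manual is one that mice skipped.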
Source: https://stackoverflow.com/questions/27351903/r-mice-imputation-failing