问题
I have a dataset containing missing values. I have imputed this dataset, as follows:
library(mice)
id <- c(1,2,3,4,5,6,7,8,9,10)
group <- c(0,1,1,0,1,1,0,1,0,1)
measure_1 <- c(60,80,90,54,60,61,77,67,88,90)
measure_2 <- c(55,NA,88,55,70,62,78,66,65,92)
measure_3 <- c(58,88,85,56,68,62,89,62,70,99)
measure_4 <- c(64,80,78,92,NA,NA,87,65,67,96)
measure_5 <- c(64,85,80,65,74,69,90,65,70,99)
measure_6 <- c(70,NA,80,55,73,64,91,65,91,89)
dat <- data.frame(id, group, measure_1, measure_2, measure_3, measure_4, measure_5, measure_6)
dat$group <- as.factor(dat$group)
imp_anova <- mice(dat, maxit = 0)
meth <- imp_anova$method
pred <- imp_anova$predictorMatrix
imp_anova <- mice(dat, method = meth, predictorMatrix = pred, seed = 2018,
maxit = 10, m = 5)
This creates five imputed datasets. Then, I created the complete datasets (example dataset 1):
impute_1 <- mice::complete(imp_anova, 1) # complete set 1
And then I performed the desired analysis:
library(reshape)
library(reshape2)
datLong <- melt(impute_1, id = c("id", "group"), measure.vars = c("measure_1", "measure_2", "measure_3", "measure_4", "measure_5", "measure_6"))
colnames(datLong) <- c("ID", "Gender", "Time", "Value")
table(datLong$Time) # To check if correct
datLong$ID <- as.factor(datLong$ID)
library(ez)
model_mixed_1 <- ezANOVA(data = datLong,
dv = Value,
wid = ID,
within = Time,
between = Gender,
detailed = TRUE,
type = 3,
return_aov = TRUE)
I did this for all the five datasets, resulting in five models:
model_mixed_1
model_mixed_2
model_mixed_3
model_mixed_4
model_mixed_5
Now I want to combine the results of this models, to generate one results. I have asked a similar question before, but there I focused on the models. Here I just want to ask how I can simply combine five models. Hope someone can help me!
回答1:
You understood the basic multiple imputation process right. The process is like:
- First your create your m imputed datasets. (mice() - function)
- Then you do your analysis on each of these datasets. (with() - function)
- In the end you combine these results together. (pool() - function)
This is a quite often misunderstand process (often people assume you have to combine your m imputed datasets together to one dataset - which is wrong)
Here is a picture of this process:
Now you have to follow these steps within the mice framework - you did this only till step 1.
Here an excerpt from the mice help:
The pool() function combines the estimates from m repeated complete data analyses. The typical sequence of steps to do a multiple imputation analysis is:
Impute the missing data by the mice function, resulting in a multiple imputed data set (class mids);
Fit the model of interest (scientific model) on each imputed data set by the with() function, resulting an object of class mira;
Pool the estimates from each model into a single set of estimates and standard errors, resulting is an object of class mipo;
Optionally, compare pooled estimates from different scientific models by the pool.compare() function.
Code wise this can look for example like this:
imp <- mice(nhanes, maxit = 2, m = 5)
fit <- with(data=imp,exp=lm(bmi~age+hyp+chl))
summary(pool(fit))
来源:https://stackoverflow.com/questions/54075836/multiple-imputed-datasets-pooling-results