Time lag analysis on list of imputed datasets

Question

My question and data are similar to those in the post: Loop Through Data with Sequential Time Lags output Linear Regression Coefficients

set.seed(242)
df <- data.frame(
  month = order(seq(1, 248, 1), decreasing = TRUE),
  psit  = c(79, 1, NA, 69, 66, 77, 76, 93, NA, 65, NA, 3, 45, 64, 88, 88,
            76, NA, NA, 85, sample(1:10, 228, replace = TRUE)),
  var   = sample(1:10, 248, replace = TRUE)
)

However, the structure of my dataset differs because I have imputed the missing values of psit. After running mice(), psit, month and var are nested within the object tempdata, which holds 40 imputed datasets.

tempdata <- mice(data = df, m = 40, method = "pmm", maxit = 50, seed = 500)

I want to take the 40 imputed datasets, run the same time lag analysis on each one (unlike the post above, where there was a single dataset to perform the time lag analysis on), and pool the R-squared values for each time lag across all imputed datasets.

Posts on mice indicate you can pool the results of an lm() using:

modelFit1 <- with(tempdata,lm(psit~ month))
summary(pool(modelFit1))

However, I want to pool the R-squared values for each time lag across all 40 imputed datasets. I am unsure how to apply the dyn$lm() function to each imputed dataset in tempdata and then use pool() to pool the R-squared values.

To achieve that result, I have tried the following, but I get an error:

modelFit1 <- with(tempData, lapply(1:236, function(i) dyn$lm(psit ~ 
             lag(var, -i),tail(z, 12+i))))
summary(pool(modelFit1),function(x) summary(x)$r.squared))

Answer 1:

Since you are using the mice package, wouldn't pool.r.squared() work for your purpose?

pool.r.squared(modelFit1, adjusted = FALSE)
#           est      lo 95    hi 95       fmi
# R^2 0.1345633 0.06061036 0.226836 0.1195257
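To get one pooled R-squared per lag rather than a single value, one possible approach is to fit the lagged model on every completed dataset and pass each set of fits to pool.r.squared(). The sketch below is only an illustration under some assumptions: it uses plain lm() with a manually shifted column instead of dyn$lm() (pool.r.squared() checks that the fits have class "lm"), it assumes the rows are already in the order you want to lag over, and it uses smaller m and maxit values and only 12 lags to keep the run short. The helpers shift_by() and pooled_r2_for_lag() are hypothetical names, not part of mice.

```r
library(mice)

# Same data construction as in the question.
set.seed(242)
df <- data.frame(
  month = order(seq(1, 248, 1), decreasing = TRUE),
  psit  = c(79, 1, NA, 69, 66, 77, 76, 93, NA, 65, NA, 3, 45, 64, 88, 88,
            76, NA, NA, 85, sample(1:10, 228, replace = TRUE)),
  var   = sample(1:10, 248, replace = TRUE)
)

# Assumption: m and maxit are reduced here only to keep the sketch fast.
tempdata <- mice(df, m = 5, method = "pmm", maxit = 5, seed = 500,
                 printFlag = FALSE)

# Extract every completed dataset as a list of data frames.
completed <- complete(tempdata, action = "all")

# Hypothetical helper: shift x down by k rows, padding the top with NA.
shift_by <- function(x, k) c(rep(NA, k), head(x, -k))

# Hypothetical helper: fit psit ~ lagged var on each completed dataset,
# wrap the fits as a "mira" object, and pool the R-squared values.
pooled_r2_for_lag <- function(k) {
  fits <- lapply(completed, function(d) {
    d$var_lag <- shift_by(d$var, k)
    lm(psit ~ var_lag, data = d)
  })
  pool.r.squared(as.mira(fits), adjusted = FALSE)
}

# Assumption: 12 lags for illustration; the question uses 1:236.
lags <- 1:12
pooled <- do.call(rbind, lapply(lags, pooled_r2_for_lag))
rownames(pooled) <- paste0("lag_", lags)
pooled
```

Each row of pooled then holds the estimate, the 95% interval bounds and the fmi for one lag, in the same layout as the single-model output above.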

Source: https://stackoverflow.com/questions/46395927/time-lag-analysis-on-list-of-imputed-datasets

Tags

time-series

r-mice