Marginal effects with survey weights and multiple imputations

Question

I am working with survey data that use probability weights and multiple imputations. I would like to get marginal effects after estimating a logit model using the imputed data sets and the survey weights, but I cannot figure out how to do this in R. Stata has the package mimrgns, which makes it pretty easy. There is also this article (pdf) and supplementary material (pdf) that give some direction, but I can't seem to apply them to my situation.

In the following example, please assume I already imputed "income" across three data sets (i.e., df1, df2, and df3). I would like to predict "gender" using employment status (i.e., working) and "income."

Here is a reproducible example.

library(tibble)
library(survey)
library(mitools)
library(ggeffects)

# Data set 1
# Note that I am excluding the "income" variable from the "df"s and creating
# it separately so that it varies between the data sets. This simulates the
# variation with multiple imputation. Since I am using the same seed
# (i.e., 123), all the other variables will be the same; the only one that
# will vary is "income."

set.seed(123)

df1 <- tibble(id      = seq(1, 100, by = 1),
              gender  = as.factor(rbinom(n = 100, size = 1, prob = 0.50)),
              working = as.factor(rbinom(n = 100, size = 1, prob = 0.40)),
              pweight = sample(50:500, 100,  replace   = TRUE))


# Create random income variable.

set.seed(456)

income <- tibble(income = sample(0:100000, 100))

# Bind it to df1

df1 <- cbind(df1, income)


# Data set 2

set.seed(123)

df2 <- tibble(id      = seq(1, 100, by = 1),
              gender  = as.factor(rbinom(n = 100, size = 1, prob = 0.50)),
              working = as.factor(rbinom(n = 100, size = 1, prob = 0.40)),
              pweight = sample(50:500, 100,  replace   = TRUE))

set.seed(789)

income <- tibble(income = sample(0:100000, 100))

df2 <- cbind(df2, income)


# Data set 3

set.seed(123)

df3 <- tibble(id      = seq(1, 100, by = 1),
              gender  = as.factor(rbinom(n = 100, size = 1, prob = 0.50)),
              working = as.factor(rbinom(n = 100, size = 1, prob = 0.40)),
              pweight = sample(50:500, 100,  replace   = TRUE))

set.seed(101)

income <- tibble(income = sample(0:100000, 100))

df3 <- cbind(df3, income)


# Apply weights via svydesign

imputation <- svydesign(id      = ~id,
                        weights = ~pweight,
                        data    = imputationList(list(df1, 
                                                      df2, 
                                                      df3)))


# Logit model with weights and imputations

logitImp <- with(imputation, svyglm(gender ~ working + income,
                             family = binomial()))


# Combine results across MI datasets

summary(MIcombine(logitImp))
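
For context, the pooling step above follows Rubin's rules. Below is a minimal base-R sketch of that pooling, not the mitools implementation: the function name pool_rubin and the toy numbers are hypothetical, and the sketch leaves out the degrees-of-freedom and missing-information calculations that MIcombine also reports.

```r
# Pool m sets of estimates with Rubin's rules (base R only).
# estimates: list of numeric vectors, one per imputed data set
# variances: list of covariance matrices, one per imputed data set
pool_rubin <- function(estimates, variances) {
  m     <- length(estimates)
  q_mat <- do.call(rbind, estimates)        # m x k matrix of estimates
  q_bar <- colMeans(q_mat)                  # pooled point estimate
  u_bar <- Reduce(`+`, variances) / m       # within-imputation variance
  b     <- var(q_mat)                       # between-imputation variance
  total <- u_bar + (1 + 1 / m) * b          # total variance
  list(estimate = q_bar, variance = total, se = sqrt(diag(total)))
}

# Hypothetical estimates from three imputations (illustrative numbers only)
est  <- list(c(0.10, 0.50), c(0.20, 0.40), c(0.30, 0.60))
vars <- list(diag(c(0.04, 0.09)), diag(c(0.04, 0.09)), diag(c(0.04, 0.09)))

pooled <- pool_rubin(est, vars)
print(pooled$estimate)
print(pooled$se)
```

With the objects above, the inputs could presumably be obtained with lapply(logitImp, coef) and lapply(logitImp, vcov); that is an assumption rather than something verified here.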

Normally I would use library(ggeffects) to get marginal effects, but when I try it with the imputed data I get the following error:

Error in class(model) <- "lmerMod" : attempt to set an attribute on NULL

Here is an example of how I would do it without the imputation, using "df1" as the data set.

# Create new svydesign variable

noImp <- svydesign(id      = ~id,
                   weights = ~pweight, 
                   data    = df1)


# Run model

logit <- svyglm(gender ~ working + income,
                family = binomial,
                design = noImp,
                data   = df1)


# Get marginal effects at the mean

ggpredict(logit, terms = "working")
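
Assuming the quantity of interest is the predicted probability for each level of "working" with "income" held at its mean, it can be sketched by hand from a coefficient vector and its covariance matrix. The helper predict_prob and all numbers below are hypothetical placeholders, not output from the model above. The confidence interval is built on the logit scale and then back-transformed.

```r
# Predicted probability and confidence interval for one covariate profile.
# beta: coefficient vector; V: its covariance matrix;
# x: covariate values, including a leading 1 for the intercept.
predict_prob <- function(beta, V, x, level = 0.95) {
  eta <- sum(x * beta)                      # linear predictor (logit scale)
  se  <- sqrt(drop(t(x) %*% V %*% x))       # SE of the linear predictor
  z   <- qnorm(1 - (1 - level) / 2)         # normal critical value
  c(prob  = plogis(eta),
    lower = plogis(eta - z * se),
    upper = plogis(eta + z * se))
}

# Hypothetical coefficients: intercept, working1, income (illustrative only)
beta        <- c(-0.2, 0.4, 0.00001)
V           <- diag(c(0.09, 0.16, 1e-11))
mean_income <- 50000

p0 <- predict_prob(beta, V, c(1, 0, mean_income))  # working = 0
p1 <- predict_prob(beta, V, c(1, 1, mean_income))  # working = 1
print(rbind(working0 = p0, working1 = p1))
```

The same linear predictors and their variances, computed once per imputed data set, are the kind of quantities that could then be pooled across imputations before back-transforming to probabilities.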

Any idea how to do this with multiple imputation?

Source: https://stackoverflow.com/questions/48506315/marginal-effects-with-survey-weights-and-multiple-imputations

Tags

survey

imputation