I am trying to get the marginal effects, according to this post: http://andrewgelman.com/2016/01/14/rstanarm-and-more/
td <- readRDS("some data")
CHAINS <- 1
CORES <- 1
SEED <- 42
ITERATIONS <- 2000
MAX_TREEDEPTH <- 9
md <- td[,.(y,x1,x2)] # selection the columns i need. y is binary
glm1 <- stan_glm(y~x1+x2,
data = md,
family = binomial(link="logit"),
prior = NULL,
prior_intercept = NULL,
chains = CHAINS,
cores = CORES,
seed = SEED,
iter = ITERATIONS,
control=list(max_treedepth=MAX_TREEDEPTH)
)
# launch_shinystan(glm1)
tmp <- posterior_predict(glm1,newdata=md[,.(x1,x2)])
Issue
After running this code i get the following error:
I get an error that y
not found, which actually means that i also need to pass y
in the newdata
, which it shouldn't be the case according to ?posterior_predict
Reasoning
I need tmp <- posterior_predict(glm1,newdata=md[,.(x1,x2)])
because according to the post above (as far as i understand), in order to calculate the marginal effect of x1 (if i assume that x1 is binary) would be
temp <- md
temp[,x1:=0]
temp[,x2:=mean(x2)]
number_0 <- posterior_predict(glm1,newdata=temp)
temp <- md
temp[,x1:=1]
temp[,x2:=mean(x2)]
number_1 <- posterior_predict(glm1,newdata=temp)
marginal_effect_x1 <- number_1 - number_0
For a binary logit model, the marginal effect of a continuous variable is the derivative of the probability of success with respect to that variable, which by the chain rule is the logistic density (evaluated at some values of the predictors, usually the observed values of the predictors) multiplied by the coefficient of the variable in question. In your case, that would be
df <- as.data.frame(glm1)
ME <- df$x2 * dlogis(posterior_linpred(glm1))
Since this depends on the observed values of the predictors, it is common to average over the data with
AME <- rowMeans(ME)
In the case of a binary predictor, you can just subtract the probability of success when x1 = 0
from the probability of success when x1 = 1
via
nd <- md
nd$x1 <- 0
p0 <- posterior_linpred(glm1, newdata = nd, transform = TRUE)
nd$x1 <- 1
p1 <- posterior_linpred(glm1, newdata = nd, transform = TRUE)
ME <- p1 - p0
AME <- rowMeans(ME)
来源:https://stackoverflow.com/questions/45037485/calculating-marginal-effects-in-binomial-logit-using-rstanarm