Confidence intervals for predictions from logistic regression

醉梦人生 2020-12-22 17:22

In R, predict.lm computes predictions from a fitted linear regression and can also compute confidence intervals for those predictions. According to the manu

1 answer
  • 2020-12-22 17:45

    The usual way is to compute a confidence interval on the scale of the linear predictor, where the sampling distribution of the estimate is closer to normal (Gaussian), and then apply the inverse of the link function to map that interval from the linear predictor scale to the response scale.

    To do this you need two things:

    1. call predict() with type = "link", and
    2. call predict() with se.fit = TRUE.

    The first produces predictions on the scale of the linear predictor; the second returns the standard errors of those predictions. For example, using mtcars as working data:

    ## Working example data: mpg as the predictor x, vs (0/1) as the response y
    foo <- mtcars[, c("mpg", "vs")]
    names(foo) <- c("x", "y")
    mod <- glm(y ~ x, data = foo, family = binomial)
    ## Grid of 100 evenly spaced x values to predict at
    preddata <- with(foo, data.frame(x = seq(min(x), max(x), length.out = 100)))
    preds <- predict(mod, newdata = preddata, type = "link", se.fit = TRUE)
    

    preds is then a list with components fit and se.fit (plus residual.scale, which is not needed here).
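
    To see what predict() returns in this case, the list can be inspected directly. This is a minimal sketch that rebuilds the mtcars-based example data (an assumption carried over from the commented example above) so it runs on its own:

```r
# Assumed example data: mtcars, with mpg as predictor x and vs (0/1) as response y
foo <- data.frame(x = mtcars$mpg, y = mtcars$vs)
mod <- glm(y ~ x, data = foo, family = binomial)
preddata <- data.frame(x = seq(min(foo$x), max(foo$x), length.out = 100))
preds <- predict(mod, newdata = preddata, type = "link", se.fit = TRUE)

names(preds)        # component names of the returned list
length(preds$fit)   # one prediction per row of preddata
range(preds$fit)    # values are log-odds, so they are not confined to [0, 1]
```

    Because type = "link" was requested, preds$fit holds log-odds rather than probabilities, which is why the inverse link is still needed later.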

    The confidence interval on the linear predictor is then

    critval <- 1.96 ## approx 95% CI
    upr <- preds$fit + (critval * preds$se.fit)
    lwr <- preds$fit - (critval * preds$se.fit)
    fit <- preds$fit
    

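    critval is taken from a z (normal) or t distribution, with the coverage required. As a guide, summary.glm reports z statistics for families whose dispersion is fixed (binomial, Poisson) and t statistics when the dispersion is estimated (gaussian, Gamma, quasi families), so z is the natural choice for a binomial model. 1.96 is the 0.975 quantile of the standard normal distribution, which gives 95% two-sided coverage: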

    > qnorm(0.975) ## 0.975 as this is upper tail, 2.5% also in lower tail
    [1] 1.959964
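
    The same calculation generalises to other coverage levels. The sketch below derives the critical value from a chosen level instead of hard-coding 1.96; the variable names (level, alpha, crit_z, crit_t) are illustrative, and the t-based value is shown only for comparison, since it would normally be considered for families with an estimated dispersion rather than for a binomial model:

```r
level <- 0.95                     # desired two-sided coverage (assumed value)
alpha <- 1 - level
crit_z <- qnorm(1 - alpha / 2)    # normal critical value, about 1.96 for 95%

# t-based alternative, using the residual degrees of freedom of a fitted model
foo <- data.frame(x = mtcars$mpg, y = mtcars$vs)
mod <- glm(y ~ x, data = foo, family = binomial)
crit_t <- qt(1 - alpha / 2, df = df.residual(mod))

c(z = crit_z, t = crit_t)         # the t value is slightly larger than the z value
```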
    

    Now apply the inverse of the link function to fit, upr and lwr to move them from the linear predictor scale to the response (probability) scale.

    fit2 <- mod$family$linkinv(fit)
    upr2 <- mod$family$linkinv(upr)
    lwr2 <- mod$family$linkinv(lwr)
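
    To illustrate why the interval is built on the link scale first, the following sketch compares it with an interval computed directly on the response scale. It again assumes the mtcars-based example data; the names lwr_link, upr_link, lwr_resp and upr_resp are introduced here for illustration only:

```r
foo <- data.frame(x = mtcars$mpg, y = mtcars$vs)
mod <- glm(y ~ x, data = foo, family = binomial)
preddata <- data.frame(x = seq(min(foo$x), max(foo$x), length.out = 100))
critval <- qnorm(0.975)

# Interval built on the link scale, then back-transformed
link <- predict(mod, newdata = preddata, type = "link", se.fit = TRUE)
linkinv <- family(mod)$linkinv
lwr_link <- linkinv(link$fit - critval * link$se.fit)
upr_link <- linkinv(link$fit + critval * link$se.fit)

# Interval built directly on the response scale (for comparison only)
resp <- predict(mod, newdata = preddata, type = "response", se.fit = TRUE)
lwr_resp <- resp$fit - critval * resp$se.fit
upr_resp <- resp$fit + critval * resp$se.fit

range(lwr_link, upr_link)   # stays inside [0, 1] by construction
range(lwr_resp, upr_resp)   # not guaranteed to stay inside [0, 1]

# For the logit link, the inverse link is the logistic function
all.equal(unname(linkinv(link$fit)), unname(plogis(link$fit)))
```

    The back-transformed limits are valid probabilities and are generally asymmetric around the fitted value, whereas the symmetric response-scale limits may fall below 0 or above 1 near the ends of the curve.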
    

    Now you can plot all three and the data.

    library(ggplot2)
    preddata$lwr <- lwr2
    preddata$upr <- upr2
    ggplot(data = foo, mapping = aes(x = x, y = y)) +
      geom_point() +
      ## fitted curve from the same binomial GLM
      stat_smooth(method = "glm", method.args = list(family = binomial)) +
      ## back-transformed confidence limits
      geom_line(data = preddata, mapping = aes(x = x, y = upr), colour = "red") +
      geom_line(data = preddata, mapping = aes(x = x, y = lwr), colour = "red")
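
    If this is needed more than once, the steps can be bundled into a small function. The helper below is hypothetical (predict_ci is not part of base R or any package named here) and is only a sketch under the assumption that the inverse link is increasing, as it is for the logit link:

```r
# Hypothetical helper: link-scale confidence interval, back-transformed
predict_ci <- function(model, newdata, level = 0.95) {
  critval <- qnorm(1 - (1 - level) / 2)
  p <- predict(model, newdata = newdata, type = "link", se.fit = TRUE)
  linkinv <- family(model)$linkinv
  data.frame(
    newdata,
    fit = linkinv(p$fit),
    lwr = linkinv(p$fit - critval * p$se.fit),
    upr = linkinv(p$fit + critval * p$se.fit)
  )
}

# Usage with the assumed mtcars-based example data
foo <- data.frame(x = mtcars$mpg, y = mtcars$vs)
mod <- glm(y ~ x, data = foo, family = binomial)
grid <- data.frame(x = seq(min(foo$x), max(foo$x), length.out = 100))
ci <- predict_ci(mod, grid)
head(ci)
```

    With a decreasing inverse link the lwr and upr columns would come out swapped, so the helper would need to sort the two limits in that case.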
    
