Odds ratios instead of logits in stargazer() LaTeX output

感情败类 2020-12-13 21:00

When using stargazer to create a LaTeX table from a logistic regression object, the default behaviour is to output the logit values (log-odds coefficients) of each model. Is it possible to get exp(logit), i.e. odds ratios, instead?

4 Answers
  • 2020-12-13 21:23

    There are pieces of the right answer across the various posts, but none of them seem to put it all together. Assuming the following:

    glm_out <- glm(Y ~ X, data=DT, family = "binomial")

    Getting the Odds-Ratio

    For a logistic regression, the regression coefficient (b1) is the estimated increase in the log odds of Y per unit increase in X. So, to get the odds-ratio, we just use the exp function:

    OR <- exp(coef(glm_out))
    
    # pass in the exponentiated coefficients directly
    stargazer(glm_out, coef = list(OR), t.auto = FALSE, p.auto = FALSE)
    
    # or, use the apply.coef option
    stargazer(glm_out, apply.coef = exp, t.auto = FALSE, p.auto = FALSE)
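
    For readers who want something runnable, here is a minimal base-R sketch of the exponentiation step. The data are simulated, so DT, X and Y are illustrative stand-ins rather than the original poster's data, and the true coefficients (-0.5 and 0.8) are arbitrary assumptions.

    ```r
    # Simulated data; DT, X and Y are illustrative stand-ins, not real data.
    set.seed(1)
    DT <- data.frame(X = rnorm(200))
    DT$Y <- rbinom(200, size = 1, prob = plogis(-0.5 + 0.8 * DT$X))

    glm_out <- glm(Y ~ X, data = DT, family = "binomial")

    # Coefficients are on the log-odds scale; exp() maps them to odds ratios.
    log_odds <- coef(glm_out)
    OR <- exp(log_odds)
    print(cbind(log_odds, OR))
    ```

    The OR vector built this way is what gets passed to stargazer through coef = list(OR).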
    

    Getting the Standard Error of the Odds-Ratio

    You cannot simply use apply.se = exp to get the Std. Error for the Odds Ratio, because exp(SE) is not the standard error of exp(coef).

    Instead, use the delta-method approximation: Std.Error.OR = OR * SE(coef)

    # define a helper function to extract SE from glm output
    se.coef <- function(glm.output){sqrt(diag(vcov(glm.output)))}
    
    # or, you can use the arm package
    se.coef <- arm::se.coef
    
    # Get the odds ratio
    OR <- exp(coef(glm_out))
    
    # Then, we can get `Std.Error.OR` by multiplying the two:
    Std.Error.OR <- OR * se.coef(glm_out)
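
    As a rough numerical illustration of why exponentiating the standard error is not the same thing, the sketch below compares the delta-method value OR * SE(coef) with exp(SE(coef)). It uses simulated data (DT, X, Y and the coefficients are assumptions made for this example), and the names se_delta and se_naive are made up here.

    ```r
    # Simulated data; all names and values are illustrative assumptions.
    set.seed(1)
    DT <- data.frame(X = rnorm(200))
    DT$Y <- rbinom(200, size = 1, prob = plogis(-0.5 + 0.8 * DT$X))
    glm_out <- glm(Y ~ X, data = DT, family = "binomial")

    se_logit <- sqrt(diag(vcov(glm_out)))  # SEs on the log-odds scale
    OR <- exp(coef(glm_out))

    se_delta <- OR * se_logit  # delta-method SE of the odds ratio
    se_naive <- exp(se_logit)  # result of applying exp to the SEs

    print(cbind(OR, se_delta, se_naive))
    ```

    Note that exp() of a positive standard error is always greater than 1, regardless of how precisely the coefficient is estimated, which is one sign that it is not a meaningful measure of uncertainty.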
    

    So, to get it into stargazer, we use the following:

    # using Std Errors
    stargazer(glm_out, coef = list(OR), se = list(Std.Error.OR), t.auto = FALSE, p.auto = FALSE)
    

    Computing CIs for the Odds-Ratio

    Confidence intervals for an odds ratio are not symmetric around the estimate, so we cannot just compute OR ± 1.96*SE(OR). Instead, we compute the interval on the log-odds scale and exponentiate its endpoints: exp(coef ± 1.96*SE).

    # Based on normal distribution to compute Wald CIs:
    # we use confint.default to obtain the conventional confidence intervals
    # then, use the exp function to get the confidence intervals
    
    CI.OR <- as.matrix(exp(confint.default(glm_out)))
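
    To see that exp(confint.default(...)) is the same as exponentiating coef ± 1.96*SE by hand, and that the resulting interval is asymmetric around the odds ratio, here is a small base-R sketch. The data are simulated and CI.manual is a name invented for this example.

    ```r
    # Simulated data; all names and values are illustrative assumptions.
    set.seed(1)
    DT <- data.frame(X = rnorm(200))
    DT$Y <- rbinom(200, size = 1, prob = plogis(-0.5 + 0.8 * DT$X))
    glm_out <- glm(Y ~ X, data = DT, family = "binomial")

    b  <- coef(glm_out)
    se <- sqrt(diag(vcov(glm_out)))
    z  <- qnorm(0.975)  # roughly 1.96

    # Wald interval on the log-odds scale, then exponentiate the endpoints
    CI.manual <- exp(cbind(b - z * se, b + z * se))
    CI.OR <- exp(confint.default(glm_out))
    print(all.equal(unname(CI.manual), unname(CI.OR)))

    # Distance from the odds ratio to each bound: the upper gap is larger
    OR <- exp(b)
    print(cbind(lower_gap = OR - CI.OR[, 1], upper_gap = CI.OR[, 2] - OR))
    ```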
    

    So, to get it into stargazer, we use the following:

    # using ci.custom
    stargazer(glm_out, coef = list(OR), ci.custom = list(CI.OR), t.auto = FALSE, p.auto = FALSE, ci = TRUE)
    
    # using apply.ci
    stargazer(glm_out, apply.coef = exp, apply.ci = exp, t.auto = FALSE, p.auto = FALSE, ci = TRUE)
    

    NOTE ABOUT USING CONFIDENCE INTERVALS FOR SIGNIFICANCE TESTS:

    Do not use the Confidence Intervals of Odds Ratios to compute significance (see note and reference at the bottom). Instead, you can do it using the log odds:

    z <- coef(glm_out)/se.coef(glm_out)
    

    And, use that to get the p.values for significance tests:

    pvalue <- 2*pnorm(abs(coef(glm_out)/se.coef(glm_out)), lower.tail = F)
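
    As a sanity check, the p-values computed this way should agree with the ones summary() reports for the model. A minimal sketch on simulated data (DT, X, Y and the variable name reported are assumptions for this example):

    ```r
    # Simulated data; all names and values are illustrative assumptions.
    set.seed(1)
    DT <- data.frame(X = rnorm(200))
    DT$Y <- rbinom(200, size = 1, prob = plogis(-0.5 + 0.8 * DT$X))
    glm_out <- glm(Y ~ X, data = DT, family = "binomial")

    # Wald z statistics and two-sided p-values on the log-odds scale
    se <- sqrt(diag(vcov(glm_out)))
    z <- coef(glm_out) / se
    pvalue <- 2 * pnorm(abs(z), lower.tail = FALSE)

    # Compare with the p-values in the standard summary table
    reported <- summary(glm_out)$coefficients[, "Pr(>|z|)"]
    print(all.equal(pvalue, reported))
    ```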
    

    (source: https://data.princeton.edu/wws509/r/c3s1)

    See this link for more detailed discussion on statistical testing: https://stats.stackexchange.com/questions/144603/why-do-my-p-values-differ-between-logistic-regression-output-chi-squared-test

    "It is important to note however, that unlike the p value, the 95% CI does not report a measure’s statistical significance. In practice, the 95% CI is often used as a proxy for the presence of statistical significance if it does not overlap the null value (e.g. OR=1). Nevertheless, it would be inappropriate to interpret an OR with 95% CI that spans the null value as indicating evidence for lack of association between the exposure and outcome." (source: Explaining Odds Ratios)

  • 2020-12-13 21:31

    So, the issue is that you want to display the (non-log) odds ratio, but keep the test statistics based on the underlying linear model. By default, when you use one of the "apply" methods, such as apply.coef = exp, stargazer will recalculate the t statistics and p values. We don't want that. Also, the standard errors are on the log-odds scale, and we can't just exponentiate them. My preferred approach is to:

    1. exponentiate the coefs in stargazer
    2. turn off the auto p and auto t
    3. report (untransformed) t-statistics in the table instead of standard errors

    In code, this is:

    stargazer(model, apply.coef = exp, t.auto = FALSE, p.auto = FALSE, report = "vct*")
    
  • 2020-12-13 21:33

    As per symbiotic's comment in 2014, more recent versions of stargazer have the apply.* options for coef, se, t, p and ci, allowing the direct transformation of these statistics.

    apply.coef a function that will be applied to the coefficients.
    apply.se a function that will be applied to the standard errors.
    apply.t a function that will be applied to the test statistics.
    apply.p a function that will be applied to the p-values.
    apply.ci a function that will be applied to the lower and upper bounds of the confidence intervals.
    

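    Meaning you can pass a function directly, as below (but note that apply.se = exp does not give a valid standard error for the odds ratio, since exp(SE) is not SE(OR)):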

    stargazer(model, 
              apply.coef = exp,
              apply.se   = exp)
    

    EDIT: I have noticed, however, that simply exponentiating the CIs does not give what you would expect.

    EDIT: You can obtain the correct CIs using the method described here.

  • 2020-12-13 21:39

    stargazer allows you to substitute a lot of things: dependent variable labels, covariate labels and so forth. To substitute those, you supply a vector of variable labels; this gives you publishable row names instead of R's default variable names.

    So in order to have odds ratios, you need to supply a vector of odds ratios to stargazer. How do you obtain that vector? Very easily, actually. Let's say that your model is called model, then your code is:

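    coef.vector <- exp(coef(model))
    # t.auto/p.auto = FALSE keep the model's original test statistics and p-values
    stargazer(model, coef = list(coef.vector), t.auto = FALSE, p.auto = FALSE)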
    

    If you have multiple models in your table, then the list should be expanded, e.g. coef=list(coef.vector1,coef.vector2,...), where all vectors in the list would be derived from similar exponentiation as above.
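
    A minimal sketch of the multi-model case, using simulated data. The names DT, X1, X2, model1, model2 and or_list are assumptions made for this example, and the stargazer call only runs if the package is installed; type = "text" is used here just to print the table to the console.

    ```r
    # Simulated data; all names and values are illustrative assumptions.
    set.seed(1)
    DT <- data.frame(X1 = rnorm(200), X2 = rnorm(200))
    DT$Y <- rbinom(200, size = 1,
                   prob = plogis(-0.5 + 0.8 * DT$X1 + 0.3 * DT$X2))

    model1 <- glm(Y ~ X1, data = DT, family = "binomial")
    model2 <- glm(Y ~ X1 + X2, data = DT, family = "binomial")

    # One exponentiated coefficient vector per model, in table order
    models <- list(model1, model2)
    or_list <- lapply(models, function(m) exp(coef(m)))

    if (requireNamespace("stargazer", quietly = TRUE)) {
      # t.auto/p.auto = FALSE keep each model's original test statistics
      stargazer::stargazer(model1, model2, coef = or_list,
                           t.auto = FALSE, p.auto = FALSE, type = "text")
    }
    ```

    Building the list with lapply keeps the vectors in the same order as the models, which is what the coef argument expects.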
