Add regression line equation and R^2 on graph

梦如初夏 2020-11-21 07:24

How can I add the regression line equation and R^2 to a ggplot? My code is:

library(ggplot2)

df <- data.frame(x = c(1:100))
df$y <- 2         


        
9 answers
  • 2020-11-21 07:49

    I've modified Ramnath's post so that it a) is more generic, accepting a linear model as a parameter rather than the data frame, and b) displays negative slopes more appropriately.

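    # Build a plotmath label (equation and r^2) from a fitted linear model
    lm_eqn <- function(m) {

      # unname() keeps coefficient names out of the deparsed expression;
      # the slope is stored as an absolute value and its sign is written below
      l <- list(a = format(unname(coef(m)[1]), digits = 2),
                b = format(unname(abs(coef(m)[2])), digits = 2),
                r2 = format(summary(m)$r.squared, digits = 3))

      if (coef(m)[2] >= 0) {
        eq <- substitute(italic(y) == a + b %.% italic(x)*","~~italic(r)^2~"="~r2, l)
      } else {
        eq <- substitute(italic(y) == a - b %.% italic(x)*","~~italic(r)^2~"="~r2, l)
      }

      # Return a string that geom_text(parse = TRUE) can turn back into an expression
      as.character(as.expression(eq))
    }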
    

    Usage would change to:

    p1 = p + geom_text(aes(x = 25, y = 300, label = lm_eqn(lm(y ~ x, df))), parse = TRUE)
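
    The string returned by `lm_eqn` is a plotmath expression in text form, which `parse = TRUE` converts back into an expression. Because `geom_text` adds one label per row of the layer's data, `annotate()` is a common alternative that draws the label exactly once. The following is only a sketch under assumptions: the seed, the data (modelled on the question), the label built with `sprintf`, and the coordinates are all illustrative, and the plotting part runs only if ggplot2 is installed.

    ```r
    # Illustrative data; the seed is an assumption added for reproducibility.
    set.seed(42)
    df <- data.frame(x = 1:100)
    df$y <- 2 + 3 * df$x + rnorm(100, sd = 40)
    m <- lm(y ~ x, data = df)

    # Build a plotmath label string, writing the slope's sign explicitly.
    cf <- unname(coef(m))
    label <- sprintf(
      'italic(y) == %s %s %s %%.%% italic(x) * "," ~~ italic(R)^2 == %s',
      format(cf[1], digits = 2),
      if (cf[2] >= 0) "+" else "-",
      format(abs(cf[2]), digits = 2),
      format(summary(m)$r.squared, digits = 3)
    )

    # The label must be valid R syntax, otherwise parse = TRUE fails at draw time.
    label_expr <- parse(text = label)

    if (requireNamespace("ggplot2", quietly = TRUE)) {
      library(ggplot2)
      p <- ggplot(df, aes(x = x, y = y)) +
        geom_point() +
        geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
      # annotate() ignores the plot data, so the label is drawn a single time.
      p1 <- p + annotate("text", x = 25, y = 300, label = label, parse = TRUE)
    }
    ```

    Checking the label with `parse(text = ...)` before plotting makes syntax problems in the string visible early, rather than when the plot is rendered.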
    
  • 2020-11-21 07:54

    I really love @Ramnath's solution. To let the user customize the regression formula (instead of hard-coding y and x as the variable names) and to add the p-value to the printout (as @Jerry T commented), here is the modified version:

    lm_eqn <- function(df, y, x){
        formula = as.formula(sprintf('%s ~ %s', y, x))
        m <- lm(formula, data=df);
        # format the values into a summary string to print out
        # ~ adds a space; the equals sign and the comma need to be quoted
        eq <- substitute(italic(target) == a + b %.% italic(input)*","~~italic(r)^2~"="~r2*","~~p~"="~italic(pvalue), 
             list(target = y,
                  input = x,
                  a = format(as.vector(coef(m)[1]), digits = 2), 
                  b = format(as.vector(coef(m)[2]), digits = 2), 
                 r2 = format(summary(m)$r.squared, digits = 3),
                 # getting the pvalue is painful
                 pvalue = format(summary(m)$coefficients[2,'Pr(>|t|)'], digits=1)
                )
              )
        as.character(as.expression(eq));                 
    }
    
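    # plot base inferred from the lm_eqn() call below: hp against wt
    ggplot(mtcars, aes(x = wt, y = hp)) +
      geom_point() +
      ggrepel::geom_text_repel(label = rownames(mtcars)) +
      geom_text(x = 3, y = 300, label = lm_eqn(mtcars, 'hp', 'wt'), color = 'red', parse = TRUE) +
      geom_smooth(method = 'lm')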
    

    Unfortunately, this doesn't work with facet_wrap or facet_grid.
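
    One common workaround for the faceting limitation is to fit a separate model per facet group, collect the labels in a small data frame that contains the faceting variable, and pass that data frame to `geom_text`. The sketch below assumes `mtcars` faceted by `cyl`; `fit_label` and `labels` are hypothetical names, the label format is a simplified stand-in for `lm_eqn`, and the plotting part runs only if ggplot2 is installed.

    ```r
    # Hypothetical helper: plotmath label string for hp ~ wt fitted on one group.
    fit_label <- function(d) {
      m <- lm(hp ~ wt, data = d)
      cf <- unname(coef(m))
      sprintf(
        'italic(hp) == %s %s %s %%.%% italic(wt) * "," ~~ italic(r)^2 == %s',
        format(cf[1], digits = 2),
        if (cf[2] >= 0) "+" else "-",
        format(abs(cf[2]), digits = 2),
        format(summary(m)$r.squared, digits = 3)
      )
    }

    # One row per facet; cyl is kept numeric so it matches the type in mtcars.
    groups <- split(mtcars, mtcars$cyl)
    labels <- data.frame(
      cyl = as.numeric(names(groups)),
      label = vapply(groups, fit_label, character(1)),
      stringsAsFactors = FALSE
    )

    if (requireNamespace("ggplot2", quietly = TRUE)) {
      library(ggplot2)
      p_facets <- ggplot(mtcars, aes(x = wt, y = hp)) +
        geom_point() +
        geom_smooth(method = "lm", formula = y ~ x) +
        # The label layer has its own data, so each facet gets only its own row.
        geom_text(data = labels, aes(x = 3, y = 300, label = label),
                  color = "red", parse = TRUE, inherit.aes = FALSE) +
        facet_wrap(~cyl)
    }
    ```

    Because the label layer carries the `cyl` column, ggplot2 assigns each label to its matching panel.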

  • 2020-11-21 07:57

    Inspired by the equation style provided in this answer, a more generic approach (more than one predictor, with LaTeX output as an option) could be:

    library(magrittr)  # provides the %>% pipe used below

    print_equation= function(model, latex= FALSE, ...){
        dots <- list(...)
        cc= model$coefficients
        # sign of each predictor coefficient, rendered as " + " or " - "
        var_sign= as.character(sign(cc[-1]))%>%gsub("1","",.)%>%gsub("-"," - ",.)
        var_sign[var_sign==""]= ' + '

        # format the coefficients; arguments in ... (e.g. digits) go to format()
        f_args_abs= f_args= dots
        f_args$x= cc
        f_args_abs$x= abs(cc)
        cc_= trimws(do.call(format, args= f_args))
        cc_abs= trimws(do.call(format, args= f_args_abs))

        # response name: left-hand side of the model formula
        y_var= as.character(formula(model))[2]
        if(latex){
            star= " \\cdot "
            y_var= paste0("\\hat{",y_var,"_{i}}")
            x_vars= names(cc)[-1]%>%paste0(.,"_{i}")
        }else{
            star= " * "
            x_vars= names(cc)[-1]
        }

        # star and x_vars must exist before the predictor terms are built;
        # the intercept (first element) is excluded from these terms
        pred_vars=
            cc_abs[-1]%>%
            paste(., x_vars, sep= star)%>%
            paste0(var_sign,.)%>%paste(., collapse= "")

        equ= paste0(y_var," = ",cc_[1],pred_vars)
        if(latex){
            # use the model passed in, not an object from the global environment
            equ= paste0(equ," + \\hat{\\varepsilon_{i}} \\quad where \\quad \\varepsilon \\sim \\mathcal{N}(0,",
                        summary(model)$sigma,")")%>%paste0("$",.,"$")
        }
        cat(equ)
    }
    

    The model argument expects an lm object, the latex argument is a boolean that selects between a plain character string and a LaTeX-formatted equation, and the ... argument passes its values on to the format function.
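
    The forwarding of `...` to `format` can be illustrated in isolation. This is only a minimal sketch: `fmt_coefs` is a hypothetical helper, and the built-in `cars` data set stands in for real data.

    ```r
    # Hypothetical helper: collect ... into a list, add the coefficients as x,
    # and call format() with that argument list via do.call().
    fmt_coefs <- function(model, ...) {
      args <- list(...)
      args$x <- coef(model)
      do.call(format, args)
    }

    m <- lm(dist ~ speed, data = cars)

    # Any format() argument can be passed through, e.g. digits or nsmall.
    three_digits <- fmt_coefs(m, digits = 3)
    padded <- fmt_coefs(m, digits = 3, nsmall = 4)
    ```

    Building the argument list first and calling `do.call` means the wrapper does not need to know in advance which `format` options the caller will use.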

    I also added an option to output LaTeX, so the function can be used in an R Markdown chunk like this:

    
    ```{r echo=FALSE, results='asis'}
    print_equation(model = lm_mod, latex = TRUE)
    ```
    

    Now using it:

    df <- data.frame(x = c(1:100))
    df$y <- 2 + 3 * df$x + rnorm(100, sd = 40)
    df$z <- 8 + 3 * df$x + rnorm(100, sd = 40)
    lm_mod= lm(y~x+z, data = df)
    
    print_equation(model = lm_mod, latex = FALSE)
    

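    For one random draw of the data, this code printed: y = 11.3382963933174 + 2.5893419 * x + 0.1002227 * z (the values change from run to run because rnorm() is not seeded).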

    And if we ask for a LaTeX equation, formatting the parameters to 3 significant digits:

    print_equation(model = lm_mod, latex = TRUE, digits= 3)
    

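    This yields the same equation as a LaTeX string wrapped in $...$, with the error term appended (the rendered output is not preserved here).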
