R: repeat linear regression for all variables and save results in a new data frame

醉话见心 2021-01-28 17:03

I have a data frame named "dat" with 10 numeric variables (var1, var2, var3, var4, var5, … var10), each with several observations…

dat

   var1 var2 var3 var4 var         


        
3 answers
  •  北荒 (original poster)
    2021-01-28 17:41

    There are several ways to do what you want in R. I suggest sapply, which is a simple way to apply a function over a list of variables. Here is an example that gets the coefficients of each linear regression of var1 on each of the other variables.

    # define a function to get the coefficients of a linear regression
    do_lm <- function(var) { # var is the name of the predictor column
      res <- lm(as.formula(paste0("var1 ~ ", var)), data = dat) # fit var1 ~ var
      # coef(res)[1] is the intercept, coef(res)[2] is the slope
      coefs <- c(intercept = unname(coef(res)[1]), slope = unname(coef(res)[2]))
      return(coefs)
    }
    
    t(
      sapply(colnames(dat)[2:10], do_lm)
    )
    # sapply applies do_lm to "var2" ... "var10"
    # t transposes the result so that each variable is a row
    

    It returns:

           intercept      slope
    var2   6.4600985  0.5251232
    var3  -1.4968153  0.8630573
    var4   0.6490566  0.7660377
    var5  14.8158730 -0.5047619
    var6  10.7777778         NA
    var7  10.7777778         NA
    var8  10.7777778         NA
    var9  10.7777778         NA
    var10 10.7777778         NA
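
    The NA entries for var6 to var10 are not explained by the output itself. One possible cause, which cannot be confirmed because the asker's data is not shown, is that those columns are constant: lm cannot estimate a slope for a predictor with no variation, so it reports NA and the intercept falls back to the mean of the response. The sketch below reproduces that behaviour with made-up vectors (y and x_const are hypothetical names, not columns of dat).

```r
# Hypothetical data: a response and a predictor that never varies
y <- c(1, 3, 2, 5, 4)
x_const <- rep(7, 5)

# lm still fits, but the slope is not estimable
fit <- lm(y ~ x_const)
print(coef(fit))

# The slope is NA and the intercept equals the mean of y
print(is.na(coef(fit)[["x_const"]]))
print(mean(y))
```

    If this is what happens in dat, checking sapply(dat, sd) for columns with zero standard deviation would confirm it.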
    

    You can adapt the function do_lm passed to sapply to compute other quantities, such as correlations.
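
    As a minimal sketch of that adaptation, the version below returns one data-frame row per predictor (intercept, slope, R-squared and correlation) and binds the rows into a new data frame, which is what the question title asks for. The data here is randomly generated because the asker's dat is not shown, and the names fit_one, predictors and results are illustrative assumptions, not part of the original answer.

```r
# Hypothetical stand-in for the asker's `dat` (the real values are not shown)
set.seed(1)
dat <- as.data.frame(matrix(rnorm(200), ncol = 10))
colnames(dat) <- paste0("var", 1:10)

# Fit response ~ predictor and return a one-row data frame of results
fit_one <- function(predictor, data, response = "var1") {
  fit <- lm(reformulate(predictor, response = response), data = data)
  data.frame(
    predictor   = predictor,
    intercept   = unname(coef(fit)[1]),
    slope       = unname(coef(fit)[2]),
    r_squared   = summary(fit)$r.squared,
    correlation = cor(data[[response]], data[[predictor]], use = "complete.obs"),
    stringsAsFactors = FALSE
  )
}

# Every column except the response is used as a predictor in turn
predictors <- setdiff(colnames(dat), "var1")

# One row per regression, stacked into a single data frame
results <- do.call(rbind, lapply(predictors, fit_one, data = dat))
print(results)
```

    Returning a data frame from each call, rather than a named vector, keeps the predictor name as a regular column and makes it easy to add further columns later, for example a p-value taken from summary(fit)$coefficients.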
