How to use a loop to do linear regression in R

Submitted by 依然范特西╮ on 2021-02-08 12:13:40

Question


I wonder whether I can use something like a for loop or an apply function to do linear regression in R. I have a data frame containing variables such as crim, rm, ad, wd. I want to fit a simple linear regression of crim on each of the other variables.

Thank you!


Answer 1:


If you really want to do this, it's pretty trivial with lapply(), which we use to "loop" over the other columns of df. A custom function takes each variable in turn as x and fits a model for that covariate.

df <- data.frame(crim = rnorm(20), rm = rnorm(20), ad = rnorm(20), wd = rnorm(20))

# dat must be passed through lapply(), otherwise lm() fails because dat is missing
mods <- lapply(df[, -1], function(x, dat) lm(crim ~ x, data = dat), dat = df)

mods is now a list of lm objects. The names of mods contain the names of the covariates used to fit each model. The main drawback is that every model is fitted using a variable called x, so the coefficient is labelled x rather than with the covariate's own name. More effort could probably solve this, but I doubt that effort is worth the time.
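One way to get the real covariate names into each model is to build the formula from the column name. The following is only a sketch under assumptions: it uses reformulate() from base R's stats package, and the seed and the names covariates and mods_named are introduced here for illustration and are not part of the original answer.

```r
set.seed(1)  # assumption: seed added only so the dummy data is reproducible
df <- data.frame(crim = rnorm(20), rm = rnorm(20), ad = rnorm(20), wd = rnorm(20))

# every column except the response
covariates <- setdiff(names(df), "crim")

# build crim ~ rm, crim ~ ad, crim ~ wd from the column names
mods_named <- lapply(covariates, function(v) {
  lm(reformulate(v, response = "crim"), data = df)
})
names(mods_named) <- covariates

# the slope is now labelled with the covariate's own name instead of x
sapply(mods_named, function(m) names(coef(m))[2])
```

The cost is that the stored call of each model shows the reformulate() expression rather than a literal formula, which may or may not matter for your purposes.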

If you are just selecting models, which may be dubious, there are other ways to achieve this. For example via the leaps package and its regsubsets function:

library("leaps")
a <- regsubsets(crim ~ ., data = df, nvmax = 1, nbest = ncol(df) - 1)
summa <- summary(a)

Then plot(a) will show which of the models is "best", for example.
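If the leaps package is not available, a rough base-R stand-in is to rank the single-covariate fits by R-squared. This is a sketch of a different, simpler technique than regsubsets(), not a reimplementation of it; the seed and the name r2 are assumptions made for illustration.

```r
set.seed(1)  # assumption: seed added only so the dummy data is reproducible
df <- data.frame(crim = rnorm(20), rm = rnorm(20), ad = rnorm(20), wd = rnorm(20))

# R-squared of crim regressed on each other column, one at a time
r2 <- sapply(setdiff(names(df), "crim"), function(v) {
  summary(lm(reformulate(v, response = "crim"), data = df))$r.squared
})

# largest R-squared first
sort(r2, decreasing = TRUE)
```

With one predictor per model, ranking by R-squared gives the same ordering as ranking by residual sum of squares, so the same caveat about selecting models this way applies.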

Original

If I understand what you want (crim is a covariate and the other variables are the responses you want to predict/model using crim), then you don't need a loop. You can do this using a matrix response in a standard lm().

Using some dummy data:

df <- data.frame(crim = rnorm(20), rm = rnorm(20), ad = rnorm(20), wd = rnorm(20))

We create a matrix (multivariate) response via cbind(), passing it the three response variables we're interested in. The remaining parts of the call to lm() are exactly the same as for a univariate response:

mods <- lm(cbind(rm, ad, wd) ~ crim, data = df)
> mods

Call:
lm(formula = cbind(rm, ad, wd) ~ crim, data = df)

Coefficients:
             rm        ad        wd      
(Intercept)  -0.12026  -0.47653  -0.26419
crim         -0.26548   0.07145   0.68426

The summary() method produces a standard summary.lm output for each of the responses.
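As a small illustration of working with the multivariate fit, the sketch below pulls the crim row out of each per-response summary. The seed and the names s and slopes are assumptions added here; the numbers will differ from the output shown above because the dummy data is random.

```r
set.seed(1)  # assumption: seed added only so the dummy data is reproducible
df <- data.frame(crim = rnorm(20), rm = rnorm(20), ad = rnorm(20), wd = rnorm(20))
mods <- lm(cbind(rm, ad, wd) ~ crim, data = df)

coef(mods)          # 2 x 3 matrix: one column of coefficients per response
s <- summary(mods)  # list holding one summary.lm per response
names(s)

# crim row (estimate, std. error, t value, p value) for each response
slopes <- t(sapply(s, function(x) x$coefficients["crim", ]))
slopes
```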




Answer 2:


Suppose the response variable is fixed as the first column of your data frame, and you want to fit a separate simple linear regression of that response on each of the other variables.

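h <- iris[, -5]  # drop the Species factor; column 1 (Sepal.Length) is the response

# fit one simple regression per remaining column and store it as a2, a3, ...
for (j in 2:ncol(h)) {
  assign(paste0("a", j), lm(h[, 1] ~ h[, j]))
}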

The code above fits one model per predictor and stores the resulting lm objects in the variables a2, a3, and so on.
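A variation on the same loop collects the fits in a single named list instead of creating separate variables with assign(), which tends to be easier to iterate over afterwards. This is only a sketch; the names fits and slopes are assumptions introduced here, not part of the original answer.

```r
h <- iris[, -5]  # numeric columns only; column 1 is the response

# pre-allocate a list with one slot per predictor, named after the predictor
fits <- vector("list", ncol(h) - 1)
names(fits) <- names(h)[-1]

for (v in names(fits)) {
  fits[[v]] <- lm(h[[1]] ~ h[[v]])
}

# slope of each fit, labelled by predictor name
slopes <- sapply(fits, function(m) coef(m)[[2]])
slopes
```

A single model can then be retrieved by name, for example fits[["Petal.Length"]].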



Source: https://stackoverflow.com/questions/37314006/how-to-use-loop-to-do-linear-regression-in-r
