Linear Regression prediction in R using Leave One out Approach

Posted by ぃ、小莉子 on 2021-02-18 19:51:30

Question


I have three linear regression models built on the mtcars dataset and would like to use them to generate predictions for each row of mtcars. The predictions should be added to the mtcars data frame as three additional columns and should be generated in a for loop using the leave-one-out approach. Furthermore, predictions for model1 and model2 should be made by "grouping" on the cyl values, while predictions from model3 should be made without any grouping.

So far I have only managed to get this working with a single model in the loop:

```
model1 = lm(hp ~ mpg, data = mtcars)
model2 = lm(hp ~ mpg + hp, data = mtcars)
model3 = lm(hp ~ mpg + hp + wt, data = mtcars)

fitted_value <- NULL

for (i in 1:nrow(mtcars)) {
  validation <- mtcars[i, ]
  training <- mtcars[-i, ]
  model1 <- lm(mpg ~ hp, data = training)
  fitted_value[i] <- predict(model1, newdata = validation)
}
```


I would like to generate the predictions for all the models by first putting the models in a list or vector, and then attach the results to the mtcars data frame. Something like this:

```
model1 = lm(hp ~ mpg, data = mtcars)
model2 = lm(hp ~ mpg + hp, data = mtcars)
model3 = lm(hp ~ mpg + hp + wt, data = mtcars)

models <- list(model1, model2, model3)

fitted_value <- NULL

for (i in 1:nrow(mtcars)) {
  validation <- mtcars[i, ]
  training <- mtcars[-i, ]
  fitted_value[i] <- predict(models, newdata = validation)
}
```

Thank you for your help.

Answer 1:


You can use a nested map to fit each of the three formulas for each row i. Then just bind_cols with mtcars to attach the predictions.

library(tidyverse)

frml_1 <- as.formula("hp ~ mpg")
frml_2 <- as.formula("hp ~ mpg + drat")
frml_3 <- as.formula("hp ~ mpg + drat + wt")
frmls <- list(frml_1 = frml_1, frml_2 = frml_2, frml_3 = frml_3)

mtcars %>%
  bind_cols(
    map(1:nrow(mtcars), function(i) {
      map_dfc(frmls, function(frml) {
        training <- mtcars[-i, ]
        fit <- lm(frml, data = training)
        
        validation <- mtcars[i, ]
        predict(fit, newdata = validation)
      })
    }) %>%
    bind_rows()
  )

                     mpg cyl  disp  hp drat    wt  qsec vs am gear carb    frml_1    frml_2    frml_3
Mazda RX4           21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4 138.65796 138.65796 140.61340
Mazda RX4 Wag       21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4 138.65796 138.65796 139.55056
Datsun 710          22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1 122.76445 122.76445 124.91348
Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1 135.12607 135.12607 134.36670
Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2 158.96634 158.96634 158.85438
Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1 164.26418 164.26418 164.42112
Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4 197.81716 197.81716 199.74665
...

Note that the formulas have hp removed from RHS, as hp is also the response. I used drat instead for demonstration purposes.
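
As a cross-check on the refitting approach above, the following is a minimal sketch of a different technique: for ordinary least squares, the leave-one-out prediction for row i can be computed from a single fit on the full data as y_i - e_i / (1 - h_ii), where e_i is the ordinary residual and h_ii is the leverage (hat value). The helper names loo_predict and loo_refit are made up for this illustration and are not part of the original answer.

```r
# Leave-one-out predictions from a single fit, using the OLS identity
# yhat_(-i) = y_i - e_i / (1 - h_ii).
loo_predict <- function(formula, data) {
  fit <- lm(formula, data = data)
  y <- model.response(model.frame(fit))
  unname(y - residuals(fit) / (1 - hatvalues(fit)))
}

# Explicit refit on every training set, for comparison.
loo_refit <- function(formula, data) {
  vapply(seq_len(nrow(data)), function(i) {
    fit <- lm(formula, data = data[-i, ])
    unname(predict(fit, newdata = data[i, ]))
  }, numeric(1))
}

frml <- hp ~ mpg + drat + wt
shortcut <- loo_predict(frml, mtcars)
refit <- loo_refit(frml, mtcars)

# The two vectors are expected to agree up to floating-point error.
print(all.equal(shortcut, refit))
```

The shortcut only applies to models fitted by lm on the full data set; it does not cover the per-cyl grouping mentioned in the question, where the training set changes with the group of the held-out row.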




Answer 2:


I have been able to accomplish this with the following script:

# Preallocate one prediction vector per model
fitted_value1 <- numeric(nrow(mtcars))
fitted_value2 <- numeric(nrow(mtcars))
fitted_value3 <- numeric(nrow(mtcars))

for (i in seq_len(nrow(mtcars))) {
  validation <- mtcars[i, ]
  training <- mtcars[-i, ]
  model1 <- lm(hp ~ mpg, data = training)
  model2 <- lm(hp ~ mpg + hp, data = training)
  model3 <- lm(hp ~ mpg + hp + wt, data = training)
  fitted_value1[i] <- predict(model1, newdata = validation)
  fitted_value2[i] <- predict(model2, newdata = validation)
  fitted_value3[i] <- predict(model3, newdata = validation)
}

# Bind the predictions once, after the loop has filled all three vectors;
# binding inside the loop fails while the vectors are still being grown
res <- cbind(mtcars, fitted_value1, fitted_value2, fitted_value3)

How can I improve this code? I would like to take the models out of the loop, save them as a list and only refer to the list inside the loop. This is more or less what I would ideally want (but it's not working):

model1 =lm(hp ~ mpg, data = mtcars)
model2 =lm(hp ~ mpg + hp, data = mtcars)
model3 =lm(hp ~ mpg + hp + wt, data = mtcars)
models <- list(model1, model2, model3)

fitted_value <- NULL

for(i in 1:nrow(mtcars)){
  for (j in models){

    validation<-mtcars[i,]
    training<-mtcars[-i,]
    fitted_value[i] <-predict(models[j], newdata = validation)

    # this should save the predictions for all the models and append it to the original dataframe
    df <- cbind(mtcars,fitted_value) 
  }
}

Thank you for your help
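
One possible way to get the list-driven loop asked for above is to keep the model formulas, rather than fitted models, in the list, since each model has to be refitted on every training set. The sketch below is only an illustration under a few assumptions: drat stands in for the hp predictor (as in the first answer, because hp is already the response), the names formulas, by_cyl and preds are made up for this example, and "grouping" is read as training models 1 and 2 only on the other cars that share the held-out row's cyl value.

```r
# Formulas, not fitted models, go in the list so that each one can be
# refitted on every leave-one-out training set.
formulas <- list(
  pred_model1 = hp ~ mpg,
  pred_model2 = hp ~ mpg + drat,
  pred_model3 = hp ~ mpg + drat + wt
)

# Assumption: models 1 and 2 train only on cars with the same cyl value.
by_cyl <- c(pred_model1 = TRUE, pred_model2 = TRUE, pred_model3 = FALSE)

n <- nrow(mtcars)
preds <- matrix(NA_real_, nrow = n, ncol = length(formulas),
                dimnames = list(rownames(mtcars), names(formulas)))

for (i in seq_len(n)) {
  validation <- mtcars[i, ]
  for (j in names(formulas)) {
    # Always drop row i; optionally restrict to the same cyl group.
    keep <- seq_len(n) != i
    if (by_cyl[[j]]) keep <- keep & mtcars$cyl == validation$cyl
    fit <- lm(formulas[[j]], data = mtcars[keep, ])
    preds[i, j] <- predict(fit, newdata = validation)
  }
}

# Attach the three prediction columns to the original data frame.
res <- cbind(mtcars, preds)
head(res)
```

If a list of already fitted models exists, formula(fit) recovers the formula of each one, for example with lapply(models, formula), so the same loop can be driven from that list.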



Source: https://stackoverflow.com/questions/64905459/linear-regression-prediction-in-r-using-leave-one-out-approach
