summary dataframe from several multiple regression outputs

荒凉一梦 提交于 2019-11-29 08:51:09

There's a few packages that do this (stargazer and texreg) as well as this code for outreg.

In any case, if you are only interested in the intercept here is one approach:

# Estimate a bunch of different models, stored in a list
fits <- list() # Create empty list to store models
fits$model1 <- lm(Ozone ~ Solar.R, data = airquality)
fits$model2 <- lm(Ozone ~ Solar.R + Wind, data = airquality)
fits$model3 <- lm(Ozone ~ Solar.R + Wind + Temp, data = airquality)

# Combine the results for the intercept
do.call(cbind, lapply(fits, function(z) summary(z)$coefficients["(Intercept)", ]))


# RESULT:
#                  model1       model2        model3
# Estimate   18.598727772 7.724604e+01 -64.342078929
# Std. Error  6.747904163 9.067507e+00  23.054724347
# t value     2.756222869 8.518995e+00  -2.790841389
# Pr(>|t|)    0.006856021 1.052118e-13   0.006226638

Look at the broom package, which was created to do exactly what you are asking for. The only difference is that it puts the models into rows and the different statistics into columns, and I understand that you would prefer the opposite, but you can work around that afterwards if it is really necessary.

To give you an example, the function tidy() converts a model output into a dataframe.

model <- lm(mpg ~ cyl, data=mtcars)
summary(model) 

Call:
lm(formula = mpg ~ cyl, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.9814 -2.1185  0.2217  1.0717  7.5186 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  37.8846     2.0738   18.27  < 2e-16 ***
cyl          -2.8758     0.3224   -8.92 6.11e-10 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.206 on 30 degrees of freedom
Multiple R-squared:  0.7262,    Adjusted R-squared:  0.7171 
F-statistic: 79.56 on 1 and 30 DF,  p-value: 6.113e-10

And

 library(broom)
 tidy(model)

yields the following data frame:

         term estimate std.error statistic      p.value
1 (Intercept) 37.88458 2.0738436 18.267808 8.369155e-18
2         cyl -2.87579 0.3224089 -8.919699 6.112687e-10

Look at ?tidy.lm to see more options, for instance for confidence intervals, etc.

To combine the output of your ten models into one dataframe, you could use

library(dplyr)
bind_rows(one, two, three, ... , .id="models")

Or, if your different models come from regressions using the same dataframe, you can combine it with dplyr:

models <- mtcars %>% group_by(gear) %>% do(data.frame(tidy(lm(mpg~cyl, data=.), conf.int=T)))

Source: local data frame [6 x 8]
Groups: gear

  gear        term  estimate std.error statistic      p.value  conf.low  conf.high
1    3 (Intercept) 29.783784 4.5468925  6.550360 1.852532e-05 19.960820 39.6067478
2    3         cyl -1.831757 0.6018987 -3.043297 9.420695e-03 -3.132080 -0.5314336
3    4 (Intercept) 41.275000 5.9927925  6.887440 4.259099e-05 27.922226 54.6277739
4    4         cyl -3.587500 1.2587382 -2.850076 1.724783e-02 -6.392144 -0.7828565
5    5 (Intercept) 40.580000 3.3238331 12.208796 1.183209e-03 30.002080 51.1579205
6    5         cyl -3.200000 0.5308798 -6.027730 9.153118e-03 -4.889496 -1.5105036
标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!