I am running several multiple OLS regressions. I fit each one with lm, like this:
GroupNetReturnsStockPickers <- read.csv("GroupNetReturnsStockPickers.csv", header=TRUE, sep=",", dec=".")
ModelGroupNetReturnsStockPickers <- lm(StockPickersNet ~ Mkt.RF+SMB+HML+WML, data=GroupNetReturnsStockPickers)
names(GroupNetReturnsStockPickers)
summary(ModelGroupNetReturnsStockPickers)
This gives the following summary output:
Call:
lm(formula = StockPickersNet ~ Mkt.RF + SMB + HML + WML, data = GroupNetReturnsStockPickers)

Residuals:
      Min        1Q    Median        3Q       Max
-0.029698 -0.005069 -0.000328  0.004546  0.041948

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  4.655e-05  5.981e-04   0.078    0.938
Mkt.RF      -1.713e-03  1.202e-02  -0.142    0.887
SMB          3.006e-02  2.545e-02   1.181    0.239
HML          1.970e-02  2.350e-02   0.838    0.403
WML          1.107e-02  1.444e-02   0.766    0.444

Residual standard error: 0.009029 on 251 degrees of freedom
Multiple R-squared: 0.01033, Adjusted R-squared: -0.005445
F-statistic: 0.6548 on 4 and 251 DF, p-value: 0.624
This is perfect. However, I am running 10 multiple OLS regressions in total, and I want to build my own summary in a data frame that holds the intercept estimate, its t-value, and its p-value for each of the 10 models. The result would be a 3x10 data frame, with column names Model1, Model2, ..., Model10 and row names Value, t-value, and p-Value.
I appreciate any help.
There are a few packages that do this (stargazer and texreg), as well as this code for outreg.
In any case, if you are only interested in the intercept, here is one approach:
# Estimate a bunch of different models, stored in a list
fits <- list() # Create empty list to store models
fits$model1 <- lm(Ozone ~ Solar.R, data = airquality)
fits$model2 <- lm(Ozone ~ Solar.R + Wind, data = airquality)
fits$model3 <- lm(Ozone ~ Solar.R + Wind + Temp, data = airquality)
# Combine the results for the intercept
do.call(cbind, lapply(fits, function(z) summary(z)$coefficients["(Intercept)", ]))
# RESULT:
#                  model1       model2        model3
# Estimate   18.598727772 7.724604e+01 -64.342078929
# Std. Error  6.747904163 9.067507e+00  23.054724347
# t value     2.756222869 8.518995e+00  -2.790841389
# Pr(>|t|)    0.006856021 1.052118e-13   0.006226638
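To get closer to the layout asked for in the question (rows Value, t-value, p-Value and one column per model), the same idea can be narrowed to three statistics. This is only a sketch in base R: the three airquality fits stand in for the ten real models, and the names Model1 to Model3 and intercept_table are made up for illustration.

```r
# Hypothetical stand-ins for the ten models: three fits on airquality
fits <- list(
  Model1 = lm(Ozone ~ Solar.R, data = airquality),
  Model2 = lm(Ozone ~ Solar.R + Wind, data = airquality),
  Model3 = lm(Ozone ~ Solar.R + Wind + Temp, data = airquality)
)

# For each model, keep the intercept's estimate, t value and p-value
intercepts <- sapply(fits, function(z) {
  summary(z)$coefficients["(Intercept)", c("Estimate", "t value", "Pr(>|t|)")]
})

# Rename the rows as requested and convert the matrix to a data frame
rownames(intercepts) <- c("Value", "t-value", "p-Value")
intercept_table <- as.data.frame(intercepts)
intercept_table
```

Because sapply() takes the column names from the list names, adding more models to fits should add more columns without any other change.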
Look at the broom package, which was created to do exactly what you are asking for. The only difference is that it puts the models into rows and the different statistics into columns, and I understand that you would prefer the opposite, but you can work around that afterwards if it is really necessary.
To give you an example, the function tidy() converts a model output into a dataframe.
model <- lm(mpg ~ cyl, data=mtcars)
summary(model)
Call:
lm(formula = mpg ~ cyl, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max
-4.9814 -2.1185  0.2217  1.0717  7.5186

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  37.8846     2.0738   18.27  < 2e-16 ***
cyl          -2.8758     0.3224   -8.92 6.11e-10 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.206 on 30 degrees of freedom
Multiple R-squared: 0.7262, Adjusted R-squared: 0.7171
F-statistic: 79.56 on 1 and 30 DF, p-value: 6.113e-10
And
library(broom)
tidy(model)
yields the following data frame:
         term estimate std.error statistic      p.value
1 (Intercept) 37.88458 2.0738436 18.267808 8.369155e-18
2         cyl -2.87579 0.3224089 -8.919699 6.112687e-10
Look at ?tidy.lm to see more options, for instance for confidence intervals, etc.
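As a small illustration of the confidence-interval option, here is a sketch using the conf.int and conf.level arguments of tidy() for lm objects. The object name ci is just a placeholder.

```r
library(broom)

model <- lm(mpg ~ cyl, data = mtcars)

# conf.int = TRUE adds conf.low and conf.high columns;
# conf.level sets the coverage of the interval (0.95 is the default)
ci <- tidy(model, conf.int = TRUE, conf.level = 0.95)
ci
```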
To combine the output of your ten models into one dataframe, you could use
library(dplyr)
bind_rows(one, two, three, ... , .id="models")
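With ten models it may be easier to keep them in a named list instead of typing each tidied data frame by hand. The sketch below assumes that setup: the three mtcars fits and the names Model1 to Model3 are placeholders for the real models, and bind_rows() copies the list names into the column named by .id.

```r
library(broom)
library(dplyr)

# Hypothetical stand-ins for the real models
fits <- list(
  Model1 = lm(mpg ~ cyl, data = mtcars),
  Model2 = lm(mpg ~ cyl + wt, data = mtcars),
  Model3 = lm(mpg ~ cyl + wt + hp, data = mtcars)
)

# tidy() each model, then stack the results;
# .id = "model" stores the list names in a column called "model"
all_terms <- bind_rows(lapply(fits, tidy), .id = "model")
all_terms
```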
Or, if your different models come from regressions using the same dataframe, you can combine it with dplyr:
models <- mtcars %>% group_by(gear) %>% do(data.frame(tidy(lm(mpg ~ cyl, data = .), conf.int = TRUE)))
Source: local data frame [6 x 8]
Groups: gear

  gear        term  estimate std.error statistic      p.value  conf.low  conf.high
1    3 (Intercept) 29.783784 4.5468925  6.550360 1.852532e-05 19.960820 39.6067478
2    3         cyl -1.831757 0.6018987 -3.043297 9.420695e-03 -3.132080 -0.5314336
3    4 (Intercept) 41.275000 5.9927925  6.887440 4.259099e-05 27.922226 54.6277739
4    4         cyl -3.587500 1.2587382 -2.850076 1.724783e-02 -6.392144 -0.7828565
5    5 (Intercept) 40.580000 3.3238331 12.208796 1.183209e-03 30.002080 51.1579205
6    5         cyl -3.200000 0.5308798 -6.027730 9.153118e-03 -4.889496 -1.5105036
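If the statistics really need to be in rows and the models in columns, one possible workaround is to keep only the intercept rows and transpose them. This is a sketch on the same grouped mtcars regression; the column labels gear3, gear4, gear5 and the names stats and intercept_table are invented for the example.

```r
library(broom)
library(dplyr)

# Same grouped regression as above: one lm(mpg ~ cyl) per value of gear
models <- mtcars %>%
  group_by(gear) %>%
  do(tidy(lm(mpg ~ cyl, data = .)))

# Keep only the intercept row of each model
intercepts <- models %>%
  ungroup() %>%
  filter(term == "(Intercept)")

# Transpose so that statistics are rows and models are columns
stats <- t(as.matrix(intercepts[, c("estimate", "statistic", "p.value")]))
dimnames(stats) <- list(
  c("Value", "t-value", "p-Value"),
  paste0("gear", intercepts$gear)
)
intercept_table <- as.data.frame(stats)
intercept_table
```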
Source: https://stackoverflow.com/questions/35637535/summary-dataframe-from-several-multiple-regression-outputs