Question
I'm migrating from Stata to R (plm package) to do panel data econometrics. In Stata, panel models such as random effects usually report the within, between, and overall R-squared.
I have found that the R-squared reported for plm random effects models appears to correspond to the within R-squared. Is there any way to get the overall and between R-squared using the plm package in R?
Here is the same example in R and in Stata:
library(plm)
library(foreign) # read Stata files
download.file('http://fmwww.bc.edu/ec-p/data/wooldridge/wagepan.dta','wagepan.dta',mode="wb")
wagepan <- read.dta('wagepan.dta')
# Random effects
plm.re <- plm(lwage ~ educ + black + hisp + exper + expersq + married + union + d81 + d82 + d83 + d84 + d85 + d86 + d87,
              data = wagepan,
              model = 'random',
              index = c('nr', 'year'))
summary(plm.re)
In Stata:
use http://fmwww.bc.edu/ec-p/data/wooldridge/wagepan.dta
xtset nr year
xtreg lwage educ black hisp exper expersq married union d81 d82 d83 d84 d85 d86 d87, re
The R-squared reported in R (0.18062) is, at least in this case, similar to the within R-squared reported in Stata (0.1799). Is there any way to get, in R, the between R-squared (0.1860) and the overall R-squared (0.1830) that Stata reports?
Answer 1:
This website has the complete code to reproduce Example 14.4 in Wooldridge (2013, pp. 494-495), with the R-squared reported for all models:
# install.packages(c("wooldridge"), dependencies = TRUE)
# devtools::install_github("JustinMShea/wooldridge")
library(wooldridge)
data(wagepan)
# install.packages(c("plm", "stargazer","lmtest"), dependencies = TRUE)
library(plm); library(lmtest); library(stargazer)
# Assumed setup: panel data frame indexed by nr and year, with year as a factor (yr)
wagepan.p <- pdata.frame(wagepan, index = c("nr", "year"))
wagepan.p$yr <- factor(wagepan.p$year)
model <- lwage ~ educ + black + hisp + exper + I(exper^2) + married + union + yr
reg.ols <- plm(model, data = wagepan.p, model = "pooling")
reg.re <- plm(model, data = wagepan.p, model = "random")
reg.fe <- plm(lwage ~ I(exper^2) + married + union + yr, data = wagepan.p, model = "within")
# Pretty table of selected results (not reporting year dummies)
stargazer(reg.ols, reg.re, reg.fe, type = "text",
          column.labels = c("OLS", "RE", "FE"),
          keep.stat = c("n", "rsq"),
          keep = c("ed", "bl", "hi", "exp", "mar", "un"))
which outputs:
#> ==========================================
#>                  Dependent variable:
#>              -----------------------------
#>                         lwage
#>                 OLS       RE        FE
#>                 (1)       (2)       (3)
#> ------------------------------------------
#> educ         0.091***  0.092***
#>               (0.005)   (0.011)
#>
#> black        -0.139*** -0.139***
#>               (0.024)   (0.048)
#>
#> hisp           0.016     0.022
#>               (0.021)   (0.043)
#>
#> exper        0.067***  0.106***
#>               (0.014)   (0.015)
#>
#> I(exper2)    -0.002*** -0.005*** -0.005***
#>               (0.001)   (0.001)   (0.001)
#>
#> married      0.108***  0.064***   0.047**
#>               (0.016)   (0.017)   (0.018)
#>
#> union        0.182***  0.106***  0.080***
#>               (0.017)   (0.018)   (0.019)
#>
#> ------------------------------------------
#> Observations   4,360     4,360     4,360
#> R2             0.189     0.181     0.181
#> ==========================================
#> Note:          *p<0.1; **p<0.05; ***p<0.01
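The table above shows a single R-squared per model, so it does not by itself give the between and overall figures asked about. A minimal sketch of one way to compute all three follows, assuming Stata's definitions for xtreg: squared correlations between the response and the fitted linear predictor, taken on group-demeaned data (within), on group means (between), and on the raw data (overall). The helper `panel_r2` is hypothetical and not part of plm, and the data below are simulated only to show the call.

```r
# Hypothetical helper (not part of plm): Stata-style panel R-squared values,
# each computed as a squared correlation between the response y and the
# fitted linear predictor xb, given a group identifier id.
panel_r2 <- function(y, xb, id) {
  y_bar  <- ave(y, id)       # group mean of y, repeated on every row
  xb_bar <- ave(xb, id)      # group mean of the linear predictor
  first  <- !duplicated(id)  # one row per group for the between correlation
  c(within  = cor(y - y_bar, xb - xb_bar)^2,
    between = cor(y_bar[first], xb_bar[first])^2,
    overall = cor(y, xb)^2)
}

# Simulated balanced panel (an assumption for illustration, not the wagepan data)
set.seed(1)
id <- rep(1:50, each = 8)
x  <- rnorm(length(id))
a  <- rep(rnorm(50), each = 8)             # individual effect
y  <- 1 + 0.5 * x + a + rnorm(length(id))
xb <- 1 + 0.5 * x                          # stand-in for X %*% coef(fit)
r2 <- panel_r2(y, xb, id)
print(round(r2, 3))
```

For a fitted random effects model, `xb` would be the design matrix multiplied by the estimated coefficients, with rows in the same order as `y` and `id`. Whether the results match Stata's output to the last digit should be checked, since the two packages may estimate the variance components slightly differently.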
Source: https://stackoverflow.com/questions/34706378/calculating-within-between-or-overall-r-square-in-r