Calculating within, between or overall R-square in R

Question

I'm migrating from Stata to R (plm package) to do panel-data econometrics. In Stata, panel models such as random effects usually report the within, between and overall R-squared.

I have found that the R-squared reported for plm random-effects models corresponds to the within R-squared. Is there any way to get the overall and between R-squared using the plm package in R?

Here is the same example in R and Stata. In R:

library(plm)
library(foreign) # read Stata files
download.file('http://fmwww.bc.edu/ec-p/data/wooldridge/wagepan.dta','wagepan.dta',mode="wb")
wagepan <- read.dta('wagepan.dta')

# Random effects
plm.re <- plm(lwage ~ educ + black + hisp + exper + expersq + married + union + d81 + d82 + d83 + d84 + d85 + d86 + d87,
              data=wagepan,
              model='random',
              index=c('nr','year'))
summary(plm.re)

In Stata:

use http://fmwww.bc.edu/ec-p/data/wooldridge/wagepan.dta
xtset nr year
xtreg lwage educ  black  hisp  exper  expersq  married  union  d81  d82  d83  d84  d85  d86  d87, re

The R-squared reported by R (0.18062) is, at least in this case, close to the within R-squared reported by Stata (0.1799). Is there any way to obtain in R the between (0.1860) and overall (0.1830) R-squared values that Stata reports?

Answer 1:

This website has the complete code to reproduce Example 14.4 in Wooldridge (2013, pp. 494-495), with R-squared reported for all models:

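# install.packages(c("wooldridge"), dependencies = TRUE)
# devtools::install_github("JustinMShea/wooldridge")
library(wooldridge)
data(wagepan)

# install.packages(c("plm", "stargazer", "lmtest"), dependencies = TRUE)
library(plm); library(lmtest); library(stargazer)

# Assumed setup: the formulas below use `wagepan.p` and `yr`, so declare the
# panel structure (individual = nr, time = year) and build the year factor.
wagepan.p <- pdata.frame(wagepan, index = c("nr", "year"))
wagepan.p$yr <- factor(wagepan.p$year)

model <- lwage ~ educ + black + hisp + exper + I(exper^2) + married + union + yr

# Pooled OLS
reg.ols <- plm(model, data = wagepan.p, model = "pooling")

# Random effects
reg.re <- plm(model, data = wagepan.p, model = "random")

# Fixed effects (within): educ, black and hisp are time-invariant, and exper
# is collinear with the year dummies once individual effects are removed
reg.fe <- plm(lwage ~ I(exper^2) + married + union + yr, data = wagepan.p, model = "within")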

# Pretty table of selected results (not reporting year dummies)
stargazer(reg.ols,reg.re,reg.fe, type="text",
     column.labels=c("OLS","RE","FE"),
     keep.stat=c("n","rsq"),
     keep=c("ed","bl","hi","exp","mar","un"))

which outputs:

#> ==========================================
#>                   Dependent variable:     
#>              -----------------------------
#>                          lwage            
#>                 OLS       RE        FE    
#>                 (1)       (2)       (3)   
#> ------------------------------------------
#> educ         0.091***  0.092***           
#>               (0.005)   (0.011)           
#>                                           
#> black        -0.139*** -0.139***          
#>               (0.024)   (0.048)           
#>                                           
#> hisp           0.016     0.022            
#>               (0.021)   (0.043)           
#>                                           
#> exper        0.067***  0.106***           
#>               (0.014)   (0.015)           
#>                                           
#> I(exper2)    -0.002*** -0.005*** -0.005***
#>               (0.001)   (0.001)   (0.001) 
#>                                           
#> married      0.108***  0.064***   0.047** 
#>               (0.016)   (0.017)   (0.018) 
#>                                           
#> union        0.182***  0.106***  0.080*** 
#>               (0.017)   (0.018)   (0.019) 
#>                                           
#> ------------------------------------------
#> Observations   4,360     4,360     4,360  
#> R2             0.189     0.181     0.181  
#> ==========================================
#> Note:          *p<0.1; **p<0.05; ***p<0.01
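
The table above only shows the single R-squared that plm reports for each model. As a complement, here is a minimal sketch of the three measures the question asks about, assuming they are the squared correlations described in Stata's xtreg documentation: within is the squared correlation between the demeaned outcome and the demeaned linear prediction, between uses the individual means, and overall uses the raw values. The helper `panel_r2` is a hypothetical name, not part of plm, and `plm.re` and `wagepan` are the objects from the question. The results should be compared against the Stata output rather than assumed identical.

```r
# Hypothetical helper (not part of plm): squared-correlation R-squared
# measures in the spirit of those reported by Stata's xtreg.
#   y  : outcome, xb : linear prediction X %*% beta (no individual effect),
#   id : individual identifier, one entry per observation.
panel_r2 <- function(y, xb, id) {
  y_bar  <- ave(y, id)       # individual means of y, repeated on every row
  xb_bar <- ave(xb, id)      # individual means of the linear prediction
  one    <- !duplicated(id)  # keep one row per individual for "between"
  c(within  = cor(y - y_bar, xb - xb_bar)^2,
    between = cor(y_bar[one], xb_bar[one])^2,
    overall = cor(y, xb)^2)
}

# Applied to the random-effects fit from the question; this part only runs
# when `plm.re` and `wagepan` already exist in the workspace.
if (exists("plm.re") && exists("wagepan")) {
  f <- lwage ~ educ + black + hisp + exper + expersq + married + union +
    d81 + d82 + d83 + d84 + d85 + d86 + d87
  X  <- model.matrix(f, data = wagepan)          # untransformed regressors
  xb <- drop(X %*% coef(plm.re)[colnames(X)])    # x_it' beta_RE
  print(round(panel_r2(wagepan$lwage, xb, wagepan$nr), 4))
}
```

Because the helper only needs the outcome, a linear prediction and the individual identifier, the same call can be reused for the pooled and fixed-effects coefficients by swapping in the corresponding `xb`.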

Source: https://stackoverflow.com/questions/34706378/calculating-within-between-or-overall-r-square-in-r

Tags

stata

panel-data

plm