estimate a repeated measures random effects model with a nested structure using `plm()`

问题

Is it possible to estimate a repeated measures random effects model with a nested structure using plm() from the plm package?

I know it is possible with lmer() from the lme4 package. However, lmer() rely on a likelihood framework and I am curious to do it with plm().

Here's my minimal working example, inspired by this question. First some required packages and data,

# install.packages(c("plm", "lme4", "texreg", "mlmRev"), dependencies = TRUE)
data(egsingle, package = "mlmRev")

the data-set egsingle is a unbalanced panel consisting of 1721 school children, grouped in 60 schools, across five time points. For details see ?mlmRev::egsingle

Some light data management

dta <- egsingle
dta$Female <- with(dta, ifelse(female == 'Female', 1, 0))

Also, a snippet of the relevant data

dta[118:127,c('schoolid','childid','math','year','size','Female')]
#>     schoolid   childid   math year size Female
#> 118     2040 289970511 -1.830 -1.5  502      1
#> 119     2040 289970511 -1.185 -0.5  502      1
#> 120     2040 289970511  0.852  0.5  502      1
#> 121     2040 289970511  0.573  1.5  502      1
#> 122     2040 289970511  1.736  2.5  502      1
#> 123     2040 292772811 -3.144 -1.5  502      0
#> 124     2040 292772811 -2.097 -0.5  502      0
#> 125     2040 292772811 -0.316  0.5  502      0
#> 126     2040 293550291 -2.097 -1.5  502      0
#> 127     2040 293550291 -1.314 -0.5  502      0

Now, relying heavily on Robert Long's answer, this is how I estimate a repeated measures random effects model with a nested structure using lmer() from the lme4 package,

dta$year <- as.factor(dta$year)
require(lme4)
Model.1 <- lmer(math ~ Female + size + year + (1 | schoolid /childid), dta)
# summary(Model.1)

I looked in man page for plm() and it has an indexing command, index, but it only takes a single index and time, i.e., index = c("childid", "year"), ignoring the schoolid the model would look like this,

dta$year <- as.numeric(dta$year) 
library(plm)
Model.2 <- plm(math~Female+size+year, dta, index = c("childid", "year"), model="random")
# summary(Model.2)

To sum up the question

How can I, or is it even possible, to specify a repeated measures random effects model with a nested structure, like Model.1, using plm() from the plm package?

Below is the actual estimation results form the two models,

# require(texreg)
names(Model.2$coefficients) <- names(coefficients(Model.1)$schoolid) #ugly!
texreg::screenreg(list(Model.1, Model.2), digits = 3)  # pretty! 
#> ==============================================================
#>                                    Model 1        Model 2     
#> --------------------------------------------------------------
#> (Intercept)                           -2.693 ***    -2.671 ***
#>                                       (0.152)       (0.085)   
#> Female                                 0.008        -0.025    
#>                                       (0.042)       (0.046)   
#> size                                  -0.000        -0.000 ***
#>                                       (0.000)       (0.000)   
#> year-1.5                               0.866 ***     0.878 ***
#>                                       (0.059)       (0.059)   
#> year-0.5                               1.870 ***     1.882 ***
#>                                       (0.058)       (0.059)   
#> year0.5                                2.562 ***     2.575 ***
#>                                       (0.059)       (0.059)   
#> year1.5                                3.133 ***     3.149 ***
#>                                       (0.059)       (0.060)   
#> year2.5                                3.939 ***     3.956 ***
#>                                       (0.060)       (0.060)   
#> --------------------------------------------------------------
#> AIC                                16590.715                  
#> BIC                                16666.461                  
#> Log Likelihood                     -8284.357                  
#> Num. obs.                           7230          7230        
#> Num. groups: childid:schoolid       1721                      
#> Num. groups: schoolid                 60                      
#> Var: childid:schoolid (Intercept)      0.672                  
#> Var: schoolid (Intercept)              0.180                  
#> Var: Residual                          0.334                  
#> R^2                                                  0.004    
#> Adj. R^2                                             0.003    
#> ==============================================================
#> *** p < 0.001, ** p < 0.01, * p < 0.05

回答1:

Based on Helix123's comment I wrote the following model specification for a repeated measures random effects model with a nested structure, in plm() from the plm package using Wallace and Hussain's (1969) method, i.e. random.method = "walhus", for estimation of the variance components,

p_dta <- pdata.frame(dta, index = c("childid", "year", "schoolid"))        
Model.3 <- plm(math ~ Female + size + year, data = p_dta, model = "random",
               effect = "nested", random.method = "walhus")

The results, seen in Model.3 below, is as close to identical, to the estimates in Model.1, as I could expect. Only the intercept is slightly different (see output below).

I wrote the above based on the example from Baltagi, Song and Jung (2001) provided in ?plm. In the Baltagi, Song and Jung (2001)-example the variance components are estimated first using Swamy and Arora (1972), i.e. random.method = "swar", and second with using Wallace and Hussain's (1969). Only the Nerlove (1971) transformation does not converge using the Song and Jung (2001)-data. Whereas it was only Wallace and Hussain's (1969)-method that could converge using the egsingle data-set.

Any authoritative references on this would be appreciated. I'll keep working at it.

names(Model.3$coefficients) <- names(coefficients(Model.1)$schoolid) 
texreg::screenreg(list(Model.1, Model.3), digits = 3,
                  custom.model.names = c('Model 1', 'Model 3')) 
#> ==============================================================
#>                                    Model 1        Model 3     
#> --------------------------------------------------------------
#> (Intercept)                           -2.693 ***    -2.697 ***
#>                                       (0.152)       (0.152)   
#> Female                                 0.008         0.008    
#>                                       (0.042)       (0.042)   
#> size                                  -0.000        -0.000    
#>                                       (0.000)       (0.000)   
#> year-1.5                               0.866 ***     0.866 ***
#>                                       (0.059)       (0.059)   
#> year-0.5                               1.870 ***     1.870 ***
#>                                       (0.058)       (0.058)   
#> year0.5                                2.562 ***     2.562 ***
#>                                       (0.059)       (0.059)   
#> year1.5                                3.133 ***     3.133 ***
#>                                       (0.059)       (0.059)   
#> year2.5                                3.939 ***     3.939 ***
#>                                       (0.060)       (0.060)   
#> --------------------------------------------------------------
#> AIC                                16590.715                  
#> BIC                                16666.461                  
#> Log Likelihood                     -8284.357                  
#> Num. obs.                           7230          7230        
#> Num. groups: childid:schoolid       1721                      
#> Num. groups: schoolid                 60                      
#> Var: childid:schoolid (Intercept)      0.672                  
#> Var: schoolid (Intercept)              0.180                  
#> Var: Residual                          0.334                  
#> R^2                                                  0.000    
#> Adj. R^2                                            -0.001    
#> ==============================================================
#> *** p < 0.001, ** p < 0.01, * p < 0.05#>

来源：https://stackoverflow.com/questions/49329807/estimate-a-repeated-measures-random-effects-model-with-a-nested-structure-using

标签

lme4

hierarchical-clustering

plm

longitudinal