my problem is this: I get NA
where I should get some values in the computation of robust standard errors.
I am trying to do a fixed effect panel regression
After some time playing around, it works for me and gives me:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.5099e-16 5.2381e-16 0.8610 0.389254
C.MCAP_SEC -5.9769e-07 1.2677e-07 -4.7149 2.425e-06 ***
C.Impact_change -5.3908e-04 7.5601e-05 -7.1306 1.014e-12 ***
C.Mom 3.7560e-04 3.3378e-03 0.1125 0.910406
C.BM -1.6438e-04 1.7368e-05 -9.4645 < 2.2e-16 ***
C.PD 6.2153e-02 3.8766e-02 1.6033 0.108885
C.CashGen -2.7876e-04 1.4031e-02 -0.0199 0.984149
C.NITA -8.1792e-02 3.2153e-02 -2.5438 0.010969 *
C.PE -6.6170e-06 4.0138e-06 -1.6485 0.099248 .
C.PEdummy 1.3143e-02 4.8864e-03 2.6897 0.007154 **
factor(DS_CODE)130324 -5.2497e-16 5.2683e-16 -0.9965 0.319028
factor(DS_CODE)130409 -4.0276e-16 5.2384e-16 -0.7689 0.441986
factor(DS_CODE)130775 -4.4113e-16 5.2424e-16 -0.8415 0.400089
...
This leaves us with the question why it doesn't for you. I guess it has something to do with the format of your data. Is everything numeric? I converted the column classes and it looks like that for me:
str(dat)
'data.frame': 48251 obs. of 12 variables:
$ DS_CODE : chr "902172" "902172" "902172" "902172" ...
$ DNEW : num 2e+05 2e+05 2e+05 2e+05 2e+05 ...
$ MCAP_SEC : num 78122 71421 81907 80010 82462 ...
$ NITA : num 0.135 0.135 0.135 0.135 0.135 ...
$ CashGen : num 0.198 0.198 0.198 0.198 0.198 ...
$ BM : num 0.1074 0.1108 0.097 0.0968 0.0899 ...
$ PE : num 57 55.3 63.1 63.2 68 ...
$ PEdummy : num 0 0 0 0 0 0 0 0 0 0 ...
$ L1.retE1M : num -0.72492 0.13177 0.00122 0.07214 -0.07332 ...
$ Mom : num 0 0 0 0 0 ...
$ PD : num 5.41e-54 1.51e-66 3.16e-80 2.87e-79 4.39e-89 ...
$ Impact_change: num 0 -10.59 -10.43 0.7 -6.97 ...
What does str(data)
return for you?
The plm
package can estimate clustered SEs for panel regressions. The original data is no longer available, so here's an example using dummy data.
require(foreign)
require(plm)
require(lmtest)
test <- read.dta("http://www.kellogg.northwestern.edu/faculty/petersen/htm/papers/se/test_data.dta")
fpm <- plm(y ~ x, test, model='pooling', index=c('firmid', 'year'))
##Arellano clustered by *group* SEs
> coeftest(fpm, vcov=function(x) vcovHC(x, cluster="group", type="HC0"))
t test of coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.029680 0.066939 0.4434 0.6575
x 1.034833 0.050540 20.4755 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
If you're using lm
models (instead of plm
), then the multiwayvcov
package may help.
library("lmtest")
library("multiwayvcov")
data(petersen)
m1 <- lm(y ~ x, data = petersen)
> coeftest(m1, vcov=function(x) cluster.vcov(x, petersen[ , c("firmid")],
df_correction=FALSE))
t test of coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.029680 0.066939 0.4434 0.6575
x 1.034833 0.050540 20.4755 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
For more details see:
See also: