R's sandwich package producing strange results for robust standard errors in linear model

只谈情不闲聊 提交于 2019-12-12 03:08:14

问题


I am trying to find heteroskedasticity-robust standard errors in R, and most solutions I find are to use the coeftest and sandwich packages. However, when I use those packages, they seem to produce queer results (they're way too significant). Both my professor and I agree that the results don't look right. Could someone please tell me where my mistake is? Am I using the right package? Does the package have a bug in it? What should I use instead? Or can you reproduce the same results in STATA?

(The data is CPS data from 2010 to 2014, March samples. I created a MySQL database to hold the data and am using the survey package to help analyze it.)

Thank you in advance. (I have abridged the code somewhat to make it easier to read; let me know if you need to see more.)

>male.nat.reg <- svyglm(log(HOURWAGE) ~ AGE + I(AGE^2) + ... + OVERWORK, subset(fwyrnat2010.design, FEMALE == 0))

>summary(male.nat.reg)

Call:
NextMethod(formula = "svyglm", design)

Survey design:
subset(fwyrnat2010.design, FEMALE == 0)

Coefficients:
           Estimate Std. Error t value Pr(>|t|)    
(Intercept)   1.599e+00  6.069e-02  26.350  < 2e-16 ***
AGE           4.030e-02  3.358e-03  12.000  < 2e-16 ***
I(AGE^2)     -4.131e-04  4.489e-05  -9.204 9.97e-16 ***
NOHSDEG      -1.730e-01  1.281e-02 -13.510  < 2e-16 ***
ASSOC         1.138e-01  1.256e-02   9.060 2.22e-15 ***
SOMECOLL      5.003e-02  9.445e-03   5.298 5.11e-07 ***
BACHELOR      2.148e-01  1.437e-02  14.948  < 2e-16 ***
GRADUATE      3.353e-01  3.405e-02   9.848  < 2e-16 ***
INMETRO       3.879e-02  9.225e-03   4.205 4.93e-05 ***
NCHILDOLD     1.374e-02  4.197e-03   3.273 0.001376 ** 
NCHILDYOUNG   2.334e-02  6.186e-03   3.774 0.000247 ***
NOTWHITE     -5.026e-02  8.583e-03  -5.856 3.92e-08 ***
MARRIED      -8.226e-03  1.531e-02  -0.537 0.592018    
NEVERMARRIED -4.644e-02  1.584e-02  -2.932 0.004009 ** 
NOTCITIZEN   -6.759e-02  1.574e-02  -4.295 3.47e-05 ***
STUDENT      -1.231e-01  1.975e-02  -6.231 6.52e-09 ***
VET           3.336e-02  1.751e-02   1.905 0.059091 .  
INUNION       2.366e-01  1.271e-02  18.614  < 2e-16 ***
PROFOCC       2.559e-01  1.661e-02  15.413  < 2e-16 ***
TSAOCC        9.997e-02  1.266e-02   7.896 1.27e-12 ***
FFFOCC        2.076e-02  2.610e-02   0.795 0.427859    
PRODOCC       2.164e-01  1.281e-02  16.890  < 2e-16 ***
LABOROCC      6.074e-02  1.253e-02   4.850 3.60e-06 ***
AFFIND        6.834e-02  2.941e-02   2.324 0.021755 *  
MININGIND     3.034e-01  3.082e-02   9.846  < 2e-16 ***
CONSTIND      1.451e-01  1.524e-02   9.524  < 2e-16 ***
MANUFIND      1.109e-01  1.393e-02   7.963 8.80e-13 ***
UTILIND       1.422e-01  1.516e-02   9.379 3.78e-16 ***
WHOLESALEIND  2.884e-02  1.766e-02   1.633 0.104910    
FININD        6.215e-02  2.084e-02   2.983 0.003436 ** 
BUSREPIND     6.588e-02  1.755e-02   3.753 0.000266 ***
SERVICEIND    5.412e-02  2.403e-02   2.252 0.026058 *  
ENTERTAININD -1.192e-01  3.060e-02  -3.896 0.000159 ***
PROFIND       1.536e-01  1.854e-02   8.285 1.55e-13 ***
OVERWORK      6.738e-02  1.007e-02   6.693 6.59e-10 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for gaussian family taken to be 0.1367476)

Number of Fisher Scoring iterations: 2

>coeftest(male.nat.reg, vcov = vcovHC(male.nat.reg, type = 'HC0'))

z test of coefficients:

                Estimate  Std. Error   z value  Pr(>|z|)    
(Intercept)   1.5992e+00  9.7176e-08  16456481 < 2.2e-16 ***
AGE           4.0296e-02  5.4766e-09   7357823 < 2.2e-16 ***
I(AGE^2)     -4.1314e-04  7.3222e-11  -5642330 < 2.2e-16 ***
NOHSDEG      -1.7305e-01  1.4431e-08 -11991482 < 2.2e-16 ***
ASSOC         1.1378e-01  1.4248e-08   7985751 < 2.2e-16 ***
SOMECOLL      5.0035e-02  9.9689e-09   5019088 < 2.2e-16 ***
BACHELOR      2.1476e-01  2.0588e-08  10430993 < 2.2e-16 ***
GRADUATE      3.3533e-01  8.3327e-08   4024301 < 2.2e-16 ***
INMETRO       3.8790e-02  8.9666e-09   4326013 < 2.2e-16 ***
NCHILDOLD     1.3738e-02  5.2244e-09   2629554 < 2.2e-16 ***
NCHILDYOUNG   2.3344e-02  5.5405e-09   4213300 < 2.2e-16 ***
NOTWHITE     -5.0261e-02  1.0150e-08  -4951908 < 2.2e-16 ***
MARRIED      -8.2263e-03  1.8867e-08   -436026 < 2.2e-16 ***
NEVERMARRIED -4.6440e-02  1.7847e-08  -2602096 < 2.2e-16 ***
NOTCITIZEN   -6.7594e-02  2.4446e-08  -2765080 < 2.2e-16 ***
STUDENT      -1.2306e-01  3.2514e-08  -3785014 < 2.2e-16 ***
VET           3.3356e-02  3.0996e-08   1076125 < 2.2e-16 ***
INUNION       2.3659e-01  1.7786e-08  13301699 < 2.2e-16 ***
PROFOCC       2.5594e-01  2.2177e-08  11540563 < 2.2e-16 ***
TSAOCC        9.9971e-02  1.6707e-08   5983922 < 2.2e-16 ***
FFFOCC        2.0762e-02  2.3625e-08    878801 < 2.2e-16 ***
PRODOCC       2.1638e-01  1.3602e-08  15907683 < 2.2e-16 ***
LABOROCC      6.0741e-02  1.3445e-08   4517854 < 2.2e-16 ***
AFFIND        6.8342e-02  3.2895e-08   2077563 < 2.2e-16 ***
MININGIND     3.0343e-01  3.2948e-08   9209326 < 2.2e-16 ***
CONSTIND      1.4512e-01  2.1871e-08   6635457 < 2.2e-16 ***
MANUFIND      1.1094e-01  1.9636e-08   5649569 < 2.2e-16 ***
UTILIND       1.4216e-01  2.0930e-08   6792029 < 2.2e-16 ***
WHOLESALEIND  2.8842e-02  1.8662e-08   1545525 < 2.2e-16 ***
FININD        6.2147e-02  2.8214e-08   2202691 < 2.2e-16 ***
BUSREPIND     6.5883e-02  2.7866e-08   2364269 < 2.2e-16 ***
SERVICEIND    5.4118e-02  2.4758e-08   2185907 < 2.2e-16 ***
ENTERTAININD -1.1922e-01  2.9474e-08  -4044852 < 2.2e-16 ***
PROFIND       1.5364e-01  3.0132e-08   5098879 < 2.2e-16 ***
OVERWORK      6.7376e-02  1.0981e-08   6135525 < 2.2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

回答1:


The sandwich package is object-oriented and essentially relies on two methods being available: estfun() and bread(), see the package vignettes for more details. For objects of class svyglm these methods are not available but as svyglm objects inherit from glm the glm methods are found and used. I suspect that this leads to incorrect results in the survey context though, possibly by a weighting factor or so. I'm not familiar enough with the survey package to provide a workaround. The survey maintainer might be able to say more... Hope that helps.



来源:https://stackoverflow.com/questions/28828381/rs-sandwich-package-producing-strange-results-for-robust-standard-errors-in-lin

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!