Question
When I run what I believe is the same regression in R and in Julia, I get very different results. I suspect this is because the IV regression uses one indicator variable to instrument for another, but I can't tell whether I am doing something else wrong.
I have tried several different approaches. The data starts like this:
   rc D tau     May_ret     Jun_ret      Jul_ret     Aug_ret      Sep_ret
1 -43 0   0  0.04529617  0.02106667  0.009868421 0.032573290  0.010473186
2 -19 0   0  0.01973333  0.05752213 -0.020920502 0.027521368 -0.029535865
3  74 1   1  0.33505189  0.04494382 -0.150537640 0.246835440  0.010152284
4  54 1   1  0.03602649  0.06168831  0.030581040 0.002611276 -0.027027028
5  22 1   1 -0.01584158 -0.08417509 -0.088235296 0.012903226  0.100000001
6   7 1   1  0.02484472  0.08000000  0.039548021 0.065217391  0.006122449
D and tau are not perfectly collinear, but they are close (tau is being used to instrument for D).
My R code is
library(AER)
library(stargazer)
temp_df <- read.csv(file="output.csv")
addmay <- ivreg(May_ret ~ D*rc|tau*rc, data=temp_df)
addjun <- ivreg(Jun_ret ~ D*rc|tau*rc, data=temp_df)
addjul <- ivreg(Jul_ret ~ D*rc|tau*rc, data=temp_df)
addaug <- ivreg(Aug_ret ~ D*rc|tau*rc, data=temp_df)
addsep <- ivreg(Sep_ret ~ D*rc|tau*rc, data=temp_df)
stargazer(addmay, addjun, addjul, addaug, addsep, title="Addition Return Effect", align=TRUE, type="text", report="vc*t")
Which produces
============================================================================================================
                    Dependent variable:
                    ----------------------------------------------------------------------------------------
                    May_ret           Jun_ret           Jul_ret           Aug_ret           Sep_ret
                    (1)               (2)               (3)               (4)               (5)
------------------------------------------------------------------------------------------------------------
D                   0.006             0.033             -0.008            0.028             -0.044*
                    t = 0.304         t = 1.600         t = -0.353        t = 1.283         t = -1.943
rc                  -0.0001           -0.0001           -0.0003           -0.0003*          0.0002
                    t = -0.736        t = -0.730        t = -1.495        t = -1.784        t = 1.094
D:rc                0.0002            -0.0002           0.001***          0.0002            0.0001
                    t = 0.817         t = -0.747        t = 2.660         t = 0.706         t = 0.392
Constant            0.0002            -0.021*           -0.024*           -0.021*           0.009
                    t = 0.014         t = -1.778        t = -1.831        t = -1.673        t = 0.654
------------------------------------------------------------------------------------------------------------
Observations        827               828               824               823               818
R2                  0.003             0.007             0.007             0.026             0.012
Adjusted R2         -0.0005           0.004             0.004             0.023             0.009
Residual Std. Error 0.106 (df = 823)  0.111 (df = 824)  0.125 (df = 820)  0.116 (df = 819)  0.123 (df = 814)
============================================================================================================
Note:               *p<0.1; **p<0.05; ***p<0.01
And my Julia code is
using FixedEffectModels, CSV, RegressionTables;
test_df = CSV.read("output.csv");
additions = Dict("May" => reg(test_df, @model(May_ret ~ (D*rc~tau*rc))),
                 "Jun" => reg(test_df, @model(Jun_ret ~ (D*rc~tau*rc))),
                 "Jul" => reg(test_df, @model(Jul_ret ~ (D*rc~tau*rc))),
                 "Aug" => reg(test_df, @model(Aug_ret ~ (D*rc~tau*rc))),
                 "Sep" => reg(test_df, @model(Sep_ret ~ (D*rc~tau*rc))));
regtable(additions["May"], additions["Jun"], additions["Jul"], additions["Aug"], additions["Sep"]; below_statistic=:tstat, regression_statistics=[:nobs, :r2, :f])
Which produces
------------------------------------------------------------------
             May_ret    Jun_ret    Jul_ret    Aug_ret    Sep_ret
             --------   --------   --------   --------   --------
             (1)        (2)        (3)        (4)        (5)
------------------------------------------------------------------
(Intercept)  0.008      -0.013**   -0.006     -0.021     -0.004
             (1.580)    (-2.603)   (-1.117)   (-1.673)   (-0.771)
D            -0.004     0.023      -0.032     0.028      -0.027
             (-0.268)   (1.503)    (-1.854)   (1.283)    (-1.627)
rc           0.000***   0.000***   0.000***   -0.000     0.000***
             (NaN)      (NaN)      (NaN)      (-1.784)   (NaN)
D & rc       0.000      -0.000     0.001*     0.000      0.000
             (0.496)    (-1.372)   (2.198)    (0.706)    (1.189)
------------------------------------------------------------------
Estimator    IV         IV         IV         IV         IV
------------------------------------------------------------------
N            827        828        824        823        818
R2           0.000      0.008      0.004      0.026      0.004
F            0.200      6.306      9.604      8.526      7.380
------------------------------------------------------------------
As can be seen, the main difference is that FixedEffectModels seems to be dropping the rc variable from the regression. I have checked for multicollinearity, but that does not seem to be the problem. Am I using some part of my code incorrectly?
I am trying to replicate Table 4 of Yen-Cheng Chang, Harrison Hong, Inessa Liskovich, Regression Discontinuity and the Price Effects of Stock Market Indexing, The Review of Financial Studies, Volume 28, Issue 1, January 2015, Pages 212–246, https://doi.org/10.1093/rfs/hhu041
The authors provide part of the data. I uploaded the full data here: https://github.com/junder873/data.git
Answer 1:
After some work, I figured out my own answer. Julia's FixedEffectModels was including the rc variable in both the endogenous and the exogenous parts of the regression. The fix is to construct the interaction terms beforehand, which produces the correct output. For example:
reg(test_df, @model(May_ret ~ rc + (rc_D + D~rc_tau+ tau)))
Here rc_D is rc.*D and rc_tau is rc.*tau. In other words, I create two new columns and add them to the regression manually instead of letting the model decide which columns to add. This produces the same results.
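The two columns can be built with elementwise multiplication before calling reg. The following is only a minimal sketch: toy_df is a hypothetical data frame holding the rc, D and tau values from the six sample rows in the question (not the full output.csv), and it assumes the DataFrames package is installed.

```julia
using DataFrames

# Hypothetical toy data: rc, D and tau copied from the six sample rows
# shown in the question (not the full data set).
toy_df = DataFrame(rc  = [-43, -19, 74, 54, 22, 7],
                   D   = [0, 0, 1, 1, 1, 1],
                   tau = [0, 0, 1, 1, 1, 1])

# Build the interaction columns by hand with elementwise multiplication.
toy_df.rc_D   = toy_df.rc .* toy_df.D
toy_df.rc_tau = toy_df.rc .* toy_df.tau

println(toy_df)

# With the real data loaded as test_df and the same two columns added,
# the call from the answer would then be:
# reg(test_df, @model(May_ret ~ rc + (rc_D + D ~ rc_tau + tau)))
```

Because rc is now listed only as an exogenous regressor, and the hand-built rc_D and D are the only endogenous terms, the formula no longer relies on the package expanding D*rc.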
Source: https://stackoverflow.com/questions/55755226/julia-fixedeffectsmodels-iv-regression-doesnt-match-r-iv-regression