Question
I have a panel data set in R:
library(AER)     # provides the Fatalities data
library(plm)     # plm, plmtest, phtest
library(gplots)  # plotmeans
data(Fatalities)
# define the fatality rate
Fatalities$fatal_rate <- Fatalities$fatal / Fatalities$pop * 10000
# mandatory jail or community service?
Fatalities$punish <- with(Fatalities, factor(jail == "yes" | service == "yes", labels = c("no", "yes")))
I am studying the effect of beertax on fatal_rate from 1982 to 1988 across 48 states. Because the data follow the same states over several years, I am considering a fixed effects model. To check that it is the right fit, I first did some exploratory data analysis:
coplot(fatal_rate ~ year|state, type="b", data=Fatalities)
scatterplot(fatal_rate ~ year|state, data=Fatalities)
I also looked at the heterogeneity across states and across years separately. The plots show heterogeneity. Can I conclude from them that I should include both entity and time fixed effects?
plotmeans(fatal_rate ~ state, data=Fatalities)
plotmeans(fatal_rate ~ year, data=Fatalities)
Judging from the plots, I think the fixed effects model should include both entity and time fixed effects. To confirm this statistically, I ran the following tests:
- I checked whether there is a panel effect in the data (that is, whether a panel regression or ordinary pooled OLS is more suitable).
- The p-value was small, so there is a panel effect in the data. I then used plmtest to check whether I should add time effects as well.
To my surprise, the result showed that I don't need to add time effects, even though the plots suggest that I should.
- The next step was a Hausman test to compare the fixed effects model with the random effects model (here I have time effects added):
fixed <- plm(fatal_rate ~ beertax + drinkage + punish + miles + unemp + log(income),
             index = c("state", "year"), model = "within", effect = "twoways", data = Fatalities)
random <- plm(fatal_rate ~ beertax + drinkage + punish + miles + unemp + log(income),
              index = c("state", "year"), model = "random", data = Fatalities)
# Hausman test
phtest(fixed, random)
The p-value is less than 0.05, so I conclude that the fixed effects model is better.
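The first two steps in the list above are described without code. A minimal sketch of how such tests might be run with plm follows; it is not necessarily the exact set of calls used here. The object names pooled, fe_ind, and fe_two are introduced only for illustration, and the choice of pFtest for the F tests is an assumption.

```r
library(AER)  # provides the Fatalities data
library(plm)

data(Fatalities)
Fatalities$fatal_rate <- Fatalities$fatal / Fatalities$pop * 10000
Fatalities$punish <- with(Fatalities,
  factor(jail == "yes" | service == "yes", labels = c("no", "yes")))

# Same specification as the models in the question
form <- fatal_rate ~ beertax + drinkage + punish + miles + unemp + log(income)
idx  <- c("state", "year")

# Illustrative models: pooled OLS, state effects only, state and year effects
pooled <- plm(form, data = Fatalities, index = idx, model = "pooling")
fe_ind <- plm(form, data = Fatalities, index = idx,
              model = "within", effect = "individual")
fe_two <- plm(form, data = Fatalities, index = idx,
              model = "within", effect = "twoways")

# Step 1: is there a panel (state) effect, or would pooled OLS do?
f_state  <- pFtest(fe_ind, pooled)                            # F test
bp_state <- plmtest(pooled, effect = "individual", type = "bp")  # LM test

# Step 2: are time effects needed?
bp_time <- plmtest(pooled, effect = "time", type = "bp")  # LM test on pooled model
f_time  <- pFtest(fe_two, fe_ind)  # F test for year effects given state effects

print(f_state)
print(bp_state)
print(bp_time)
print(f_time)
```

The two time-effect tests answer slightly different questions: the Breusch-Pagan LM test looks for a random time component in the pooled residuals, while the F test asks whether year dummies add explanatory power once state effects are already in the model, so they need not agree.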
My questions: Am I following the right steps to choose a model for my data, and am I reading the plots correctly? Can I conclude from the plots that I should include both entity and time fixed effects? Why does plmtest(fixed, c("time"), type=("bp")) show a different result? Do I need to test the "between" and "within" estimators, and if so, which test should I carry out?
Source: https://stackoverflow.com/questions/65446646/what-are-the-standard-panel-data-model-selections-and-steps