R prediction package VS Stata margins

你。 提交于 2020-02-27 07:26:17

问题


I'm switching from Stata to R, and I find inconsistent results when I use prediction to compute marginal pred and the results from the Stata command margins fixing the values of a variable to x. Here is the example:

library(dplyr)
library(prediction)

d <- data.frame(x1 = factor(c(1,1,1,2,2,2), levels = c(1, 2)),
            x2 = factor(c(1,2,3,1,2,3), levels = c(1, 2, 3)),
            x3 = factor(c(1,2,1,2,1,2), levels = c(1, 2)),
            y = c(3.1, 2.8, 2.5, 4.3, 4.0, 3.5))

m2 <- lm(y ~ x1 + x2 + x3, d)
summary(m2)

marg2a <- prediction(m2, at = list(x2 = "1"))
marg2b <- prediction(m2, at = list(x1 = "1"))

marg2a %>%
  select(x1, fitted) %>%
  group_by(x1) %>%
  summarise(error = mean(fitted))

marg2b %>%
  select(x2, fitted) %>%
  group_by(x2) %>%
  summarise(error = mean(fitted))

This is the result:

# A tibble: 2 x 2
      x1    error
   <fctr>    <dbl>
1      1 3.133333
2      2 4.266667


# A tibble: 3 x 2
      x2 error
  <fctr> <dbl>
1      1 3.125
2      2 2.825
3      3 2.425

while if I try to replicate this using Stata's margins, this is the result:

regress y i.x1 i.x2 i.x3
margins i.x1, at(x2 == 1)
margins i.x2, at(x1 == 1)


------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |
          1  |      3.125   .0829157    37.69   0.017     2.071456    4.178544
          2  |      4.275   .0829157    51.56   0.012     3.221456    5.328544
------------------------------------------------------------------------------

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x2 |
          1  |      3.125   .0829157    37.69   0.017     2.071456    4.178544
          2  |      2.825   .0829157    34.07   0.019     1.771456    3.878544
          3  |      2.425   .0829157    29.25   0.022     1.371456    3.478544
------------------------------------------------------------------------------

The margins for x2 are the same in R and Stata, but when it comes to x1 there are differences and I don't know why. Really appreciate any help. Thanks,

P


回答1:


Your Stata and R code are not equivalent. To replicate that Stata code you would need:

> prediction(m2, at = list(x1 = c("1", "2"), x2 = "1"))
Average predictions for 6 observations:
 at(x1) at(x2) value
      1      1 3.125
      2      1 4.275
> prediction(m2, at = list(x2 = c("1", "2", "3"), x1 = "1"))
Average predictions for 6 observations:
 at(x2) at(x1) value
      1      1 3.125
      2      1 2.825
      3      1 2.425

That is because when you say margins i.x1 you are asking for predictions for counterfactual versions of the dataset where x1 is replaced with 1 and then replaced with 2, with the additional constraint that in both counterfactual x2 is held at 1. The same thing is occurring in your second Stata example.

This is due to the fact that Stata's margins command has an ambiguity or rather two syntactic expressions that obtain the same output. One is your code:

. margins i.x1, at(x2 == 1)

Predictive margins                              Number of obs     =          6
Model VCE    : OLS

Expression   : Linear prediction, predict()
at           : x2              =           1

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |
          1  |      3.125   .0829156    37.69   0.017     2.071457    4.178543
          2  |      4.275   .0829156    51.56   0.012     3.221457    5.328543
------------------------------------------------------------------------------

The other is more explicit about what is actually happening in the above:

. margins, at(x1 = (1 2) x2 == 1)

Predictive margins                              Number of obs     =          6
Model VCE    : OLS

Expression   : Linear prediction, predict()

1._at        : x1              =           1
               x2              =           1

2._at        : x1              =           2
               x2              =           1

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         _at |
          1  |      3.125   .0829156    37.69   0.017     2.071457    4.178543
          2  |      4.275   .0829156    51.56   0.012     3.221457    5.328543
------------------------------------------------------------------------------


来源:https://stackoverflow.com/questions/45431946/r-prediction-package-vs-stata-margins

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!