Prevent NA from being used in a lm regression

廉价感情. Submitted on 2019-12-08 19:37:41

Question


I have a vector Y containing future returns and a vector X containing current returns. The last element of Y is NA, because the last current return is also the very end of the available series.

X <- c(0.1, 0.3, 0.2, 0.5)
Y <- c(0.3, 0.2, 0.5, NA)
Other <- c(5500, 222, 523, 3677)

lm(Y ~ X + Other)

I want to make sure that the last element of each series is not included in the regression. I read the na.action documentation but I'm not clear if this is the default behaviour.

For cor(), is this the correct solution to exclude X[4] and Y[4] from the calculation?

cor(X, Y, use = "pairwise.complete.obs")

Answer 1:


The factory-fresh default for lm is to drop observations containing NA values (na.action = na.omit). Since this default can be overridden through the global setting options(na.action = ...), you might want to set na.action to na.omit explicitly:

> summary(lm(Y ~ X + Other, na.action=na.omit))

Call:
lm(formula = Y ~ X + Other, na.action = na.omit)

[snip]

  (1 observation deleted due to missingness)
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
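One way to double-check which rows were dropped, without reading the summary output, is to inspect the fitted object. This is only a minimal sketch using the toy vectors from the question; the name fit is chosen here purely for illustration.

```r
# Toy data from the question; the last Y value is missing.
X <- c(0.1, 0.3, 0.2, 0.5)
Y <- c(0.3, 0.2, 0.5, NA)
Other <- c(5500, 222, 523, 3677)

# Request na.omit explicitly so the result does not depend on
# the global options("na.action") setting.
fit <- lm(Y ~ X + Other, na.action = na.omit)

# Number of observations actually used in the fit.
nobs(fit)

# Indices of the rows removed because of missing values.
na.action(fit)
```

With only three complete rows and three coefficients the fit has no residual degrees of freedom, so this data is useful for checking the NA handling rather than for interpreting the estimates.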

As to your second question, cor(X, Y, use = 'pairwise.complete.obs') is correct. Since there are only two variables, cor(X, Y, use = 'complete.obs') would produce the same result.
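As a quick sanity check, both use settings can be compared against a correlation computed by hand on the first three elements only. This is a small sketch on the question's toy vectors; the variable names r_pairwise, r_complete, r_manual and r_default are illustrative.

```r
X <- c(0.1, 0.3, 0.2, 0.5)
Y <- c(0.3, 0.2, 0.5, NA)

# With the default use = "everything", the NA propagates to the result.
r_default <- cor(X, Y)

# Both settings drop the fourth pair, since Y[4] is NA.
r_pairwise <- cor(X, Y, use = "pairwise.complete.obs")
r_complete <- cor(X, Y, use = "complete.obs")

# Reference value: explicitly exclude the fourth element.
r_manual <- cor(X[1:3], Y[1:3])

is.na(r_default)                 # NA under the default setting
all.equal(r_pairwise, r_manual)  # same as excluding X[4], Y[4]
all.equal(r_complete, r_manual)  # identical for two vectors
```

The two settings can differ only when cor() is given more than two variables (for example a matrix), where "complete.obs" drops any row with an NA in any column while "pairwise.complete.obs" decides pair by pair.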



Source: https://stackoverflow.com/questions/8448019/prevent-na-from-being-used-in-a-lm-regresion
