Error in dataframe *tmp* replacement has x data has y

怎甘沉沦 submitted on 2019-11-29 13:46:18

I have a feeling you have NAs in your data. Look at this example:

#mtcars data set
test <- mtcars
#adding just one NA in the cyl column
test[2, 2] <- NA

#running linear model and adding the residuals to the data.frame
test$residuals <- resid(lm(mpg ~ cyl, test))
Error in `$<-.data.frame`(`*tmp*`, "residuals", value = c(0.382245430809409,  : 
  replacement has 31 rows, data has 32

As you can see, this results in an error similar to yours.

As a validation:

length(resid(lm(mpg ~ cyl, test)))
#31
nrow(test)
#32

This happens because lm, by default, applies na.omit to the data before fitting the regression (na.omit is the default na.action). Any row with an NA in one of the model variables is dropped, so resid returns fewer values than the data.frame has rows.

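If you run na.omit on your dat data set (i.e. dat <- na.omit(dat)) at the very beginning of your code, then your code should work. Keep in mind that this removes every row that has an NA in any column, not only in the columns used by the model.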
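If you would rather keep all rows, one alternative (not part of the original suggestion) is to fit the model with na.action = na.exclude. The incomplete rows are still left out of the fit, but resid pads its result with NA so that it lines up with the original data. A minimal sketch, reusing the mtcars example from above; the name fit is just for illustration:

```r
# Same setup as the example above: mtcars with one NA in the cyl column
test <- mtcars
test[2, 2] <- NA

# na.exclude still drops the incomplete row when fitting, but resid()
# pads the result with NA so it matches the rows of the original data
fit <- lm(mpg ~ cyl, data = test, na.action = na.exclude)
test$residuals <- resid(fit)

# The lengths now agree, and the row that had the NA gets an NA residual
length(resid(fit)) == nrow(test)
is.na(test$residuals[2])
```

Which approach fits better depends on whether you need the incomplete rows later on: na.omit on the data gives a smaller, clean data.frame, while na.exclude keeps every row and marks the missing residuals.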

This is an old thread, but maybe this can help someone else facing the same issue. To LyzandeR's point, check for NAs as a first line of defense. In addition, make sure that you don't have any factors in x, as this can also cause the error.
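A quick way to run both checks before fitting is sketched below. It uses the same modified mtcars data as the example earlier in the thread; the names na_counts, bad_rows and factor_cols are only illustrative.

```r
# Same setup as before: mtcars with one NA in the cyl column
test <- mtcars
test[2, 2] <- NA

# Number of NAs per column; keep only the columns that have any
na_counts <- colSums(is.na(test))
na_counts[na_counts > 0]

# Row positions that contain at least one NA
bad_rows <- which(!complete.cases(test))
bad_rows

# Names of columns stored as factors (none in mtcars)
factor_cols <- names(test)[sapply(test, is.factor)]
factor_cols
```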
