R predict() function returning too many values [closed]

Posted by 我们两清 on 2019-12-17 16:53:21

Question


I've read other posts about named variables and tried the suggested answers, but predict() still returns too many values when I run my existing model on new data. Here is a working example:

set.seed(123)
mydata <- data.frame("y"=rnorm(100,mean=0, sd = 1),"x"=c(1:100))

mylm <- lm(y ~ x, data=mydata)

# OK, so mylm is a model fitted on 100 points - let's look at it and the data
par(mfrow=c(2,2))
plot(mylm)
par(mfrow=c(1,1))
predvals <- predict(mylm, data=mydata)
plot(mydata$x,mydata$y)
lines(predvals)

No surprises here: a straight line through the generated points, both 100 observations in length. Now I generate 20 points of new data with exactly the same column names, and when I run the new data through predict() I expect to get 20 predictions, but instead I get 100. What am I missing? It's driving me crazy.

newdata <- data.frame("y"=rnorm(20,mean=0, sd = 1), "x"=c(1:20))
predvals <- predict(mylm, data=newdata)
length(newdata$y)
length(predvals)    

# quick, not elegant, way to look at it:
plot(predvals)
lines(newdata$x,newdata$y)

Do I need to tell predict() to only use 20 points or something like that?


Answer 1:


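Your issue is in predvals <- predict(mylm, data=newdata).

The correct call is predict(mylm, newdata=newdata). The predict() method for lm objects takes a named argument newdata, not data. An unrecognized data argument is silently absorbed by ... and ignored, so predict() falls back to returning the fitted values for the 100 rows the model was trained on.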
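
To make the difference concrete, here is a minimal sketch that reuses the model from the question and compares the two calls side by side. The names wrong and right are just illustrative, and the new data frame here holds only the predictor column x, on the assumption that the response y is not needed for prediction.

```r
set.seed(123)
mydata <- data.frame(y = rnorm(100, mean = 0, sd = 1), x = 1:100)
mylm <- lm(y ~ x, data = mydata)

# New data only needs the predictor column(s) used in the model formula.
newdata <- data.frame(x = 1:20)

# `data` is not an argument of predict() for lm objects, so it is ignored
# and the fitted values for the 100 training rows come back instead.
wrong <- predict(mylm, data = newdata)
length(wrong)   # 100

# `newdata` is the argument that supplies the rows to predict for.
right <- predict(mylm, newdata = newdata)
length(right)   # 20
```

Because the mistyped argument produces no warning, checking length(predvals) against nrow(newdata) after the call is a cheap way to catch this kind of slip.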



Source: https://stackoverflow.com/questions/33309792/r-predict-function-returning-too-many-values
