Error in model.frame.default: variable lengths differ

半腔热情 提交于 2019-11-26 16:34:40

问题


On running a gam model using the mgcv package, I encountered a strange error message which I am unable to understand:

“Error in model.frame.default(formula = death ~ pm10 + Lag(resid1, 1) + : variable lengths differ (found for 'Lag(resid1, 1)')”.

The number of observations used in model1 is exactly the same as the length of the deviance residual, thus I think this error is not related to difference in data size or length.

I found a fairly related error message on the web here, but that post did not receive an adequate answer, so it is not helpful to my problem.

Reproducible example and data follows:

library(quantmod)
library(mgcv) 
require(dlnm)

df <- chicagoNMMAPS
df1 <- df[,c("date","dow","death","temp","pm10")] 
df1$trend<-seq(dim(df1)[1]) ### Create a time trend

Run the model

model1<-gam(death ~ pm10 + s(trend,k=14*7)+ s(temp,k=5),
data=df1, na.action=na.omit, family=poisson)

Obtain deviance residuals

resid1 <- residuals(model1,type="deviance")

Add a one day lagged deviance to model 1

model1_1 <- update(model1,.~.+ Lag(resid1,1),  na.action=na.omit)

model1_2<-gam(death ~ pm10 + s(trend,k=14*7)+ s(temp,k=5) + Lag(resid1,1), data=df1, 
na.action=na.omit, family=poisson)

Both of these models produced the same error message.


回答1:


Joran suggested to first remove the NAs before running the model. Thus, I removed the NAs, run the model and obtained the residuals. When I updated model2 by inclusion of the lagged residuals, the error message did not appear again.

Remove NAs

df2<-df1[complete.cases(df1),]

Run the main model

model2<-gam(death ~ pm10 + s(trend,k=14*7)+ s(temp,k=5), data=df2, family=poisson)

Obtain residuals

resid2 <- residuals(model2,type="deviance")

Update model2 by including the lag 1 residuals

model2_1 <- update(model2,.~.+ Lag(resid2,1),  na.action=na.omit)



回答2:


Another thing that can cause this error is creating a model with the centering/scaling standardize function from the arm package -- m <- standardize(lm(y ~ x, data = train))

If you then try predict(m), you get the same error as in this question.




回答3:


Its simple, just make sure the data type in your columns are the same. For e.g. I faced the same error, that and an another error:

Error in contrasts<-(*tmp*, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels

So, I went back to my excel file or csv file, set a filter on the variable throwing me an error and checked if the distinct datatypes are the same. And... Oh! it had numbers and strings, so I converted numbers to string and it worked just fine for me.



来源:https://stackoverflow.com/questions/19771284/error-in-model-frame-default-variable-lengths-differ

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!