Error in arima of R: too few non-missing observations

Submitted by 本秂侑毒 on 2019-12-24 15:26:20

Question


I am using arima() and auto.arima() in R to forecast sales. The data is weekly and covers three years.

My code looks like this:

library(forecast)  # provides auto.arima() and Arima()

x<-c(1571,1501,895,1335,2306,930,2850,1380,975,1080,990,765,615,585,838,555,1449,615,705,465,165,630,330,825,555,720,615,360,765,1080,825,525,885,507,884,1230,342,615,1161, 1585,723,390,690,993,1025,1515,903,990,1510,1638,1461.67,1082,1075,2315,1014,2140,1572,794,1363,1184,1248,1344,1056,816,720,896,608,624,560,512,304,640,640,704,1072,768, 816,640,272,1168,736,1003,864,658.67,768,841,1727,944,848,432,704,850.67,1205,592,1104,976,629,814,1626,933.33,1100.33,1730,2742,1552,1038,826,1888,1440,1372,824,1824,1392,1424,768,464, 960,320,384,512,478,1488,384,338.67,176,624,464,528,592,288,544,418.67,336,752,400,1232,477.67,416,810.67,1256,1040,823,240,1422,704,718,1193,1541,1008,640,752, 1008,864,1507,4123,2176,899,1717,935)

length_data<-length(x)

length_train<-round(length_data*0.80)

forecast_period<-length_data-length_train

train_data<-x[1:length_train]

train_data<-ts(train_data,frequency=52,start=c(1,1))

validation_data<-x[(length_train+1):length_data]

validation_data<-ts(validation_data,frequency=52,start=c(ceiling((length_train)/52),((length_train)%%52+1)))

arima_output<-auto.arima(train_data) # fit the ARIMA Model

arima_validate <- Arima(x=validation_data,model=arima_output)

Error:

Error in stats::arima(x = x, order = order, seasonal = seasonal, include.mean = include.mean, :
  too few non-missing observations

What am I doing wrong? What does "too few non-missing observations" mean? I have searched the net but did not find a good explanation.

Thanks for any kind of help!


Answer 1:


arima_output is a seasonal ARIMA model:

> arima_output
Series: train_data 
ARIMA(1,0,1)(0,1,0)[52]

Arima() then attempts to refit this particular model to validation_data. But to fit a seasonal model with seasonal differencing to a time series, you need more than one full year of observations: a seasonal difference at lag 52 uses up the first 52 observations, and validation_data is shorter than that.
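The arithmetic behind this can be sketched in base R. The helper usable_obs() is a hypothetical name introduced here for illustration, and the vector v is a stand-in for the 32 validation values, not the actual sales figures:

```r
# Observations left after differencing: n - d - D * period.
# usable_obs() is an illustrative helper, not part of any package.
usable_obs <- function(n, d = 0, D = 0, period = 1) n - d - D * period

usable_obs(32, D = 1, period = 52)   # -20: nothing left to fit
usable_obs(64, D = 1, period = 52)   # 12: a series twice as long is enough

# The same effect seen directly with diff() on a plain numeric vector.
v <- seq_len(32)                     # stand-in for the 32 validation values
length(diff(v, lag = 52))            # 0
length(diff(rep(v, 2), lag = 52))    # 12
```

With zero (or fewer) usable observations there is nothing for the likelihood to be computed on, which is consistent with the "too few non-missing observations" message.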

As an illustration, note that Arima() will happily and without errors refit a time series that is twice as long as validation_data:

validation_data <- x[(length_train+1):length_data]
validation_data<-ts(rep(validation_data,2),frequency=52,
  start=c(ceiling((length_train)/52),((length_train)%%52+1)))
arima_validate <- Arima(x=validation_data,model=arima_output)

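One way of dealing with this would be to stop auto.arima() from applying seasonal differencing, by specifying D=0 (the selected model may still contain seasonal AR or MA terms; seasonal=FALSE restricts the search to purely nonseasonal models):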

validation_data <- x[(length_train+1):length_data]
validation_data<-ts(validation_data,frequency=52,
  start=c(ceiling((length_train)/52),((length_train)%%52+1)))
arima_output<-auto.arima(train_data, D=0) # fit the ARIMA Model
arima_validate <- Arima(x=validation_data,model=arima_output)
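If a strictly nonseasonal model is wanted, auto.arima() also accepts seasonal = FALSE. The following is a minimal sketch on simulated data. The series y, the seed, and the AR(1) parameters are illustrative assumptions, not the question's sales data:

```r
library(forecast)  # auto.arima(), Arima(), accuracy()

set.seed(1)
# Simulated weekly series of 158 values standing in for the sales data.
sim <- 1000 + arima.sim(list(ar = 0.5), n = 158, sd = 300)
y <- ts(as.numeric(sim), frequency = 52, start = c(1, 1))

n_train <- round(length(y) * 0.80)
train <- ts(as.numeric(y)[seq_len(n_train)], frequency = 52, start = c(1, 1))
valid <- ts(as.numeric(y)[(n_train + 1):length(y)], frequency = 52, end = end(y))

# Restrict the search to nonseasonal models, then reuse the fitted
# coefficients on the held-out values without re-estimating them.
fit <- auto.arima(train, seasonal = FALSE)
refit <- Arima(valid, model = fit)
accuracy(refit)  # error measures of the one-step fits on the held-out values
```

Dropping the seasonal part trades away any yearly pattern in exchange for a model that can be applied to a validation set shorter than one season, so it is worth checking whether the seasonal terms mattered before settling on it.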

So this did turn out to be more of a CrossValidated question...




Answer 2:


Your chosen model is ARIMA(1,0,1)(0,1,0)[52]. That is, it has a seasonal difference of lag 52. Your validation data has only 32 observations, so you cannot take the seasonal differences on the validation data without the training data that precedes it.

One way around this is to fit the model to the full time series, and then extract what you want (presumably residuals from the validation portion).

You can also improve the readability of your code:

x <- ts(x, frequency=52, start=c(1,1))
length_data <- length(x)
length_train <- round(length_data*0.80)
train_data <- ts(head(x, length_train), 
                  frequency=frequency(x), start=start(x))
validation_data <- ts(tail(x, length_data-length_train), 
                  frequency=frequency(x), end=end(x))

library(forecast)
arima_train <- auto.arima(train_data) 
arima_full <- Arima(x, model=arima_train)
res <- window(residuals(arima_full), start=start(validation_data))
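The window() call above selects observations by time stamp rather than by position. A small base R sketch shows which observations it keeps. It uses a placeholder series x_demo of 158 consecutive integers, assumed here to match the length of the question's data:

```r
# Placeholder weekly series: 158 values, three years at frequency 52.
x_demo <- ts(seq_len(158), frequency = 52, start = c(1, 1))

n_train <- round(length(x_demo) * 0.80)      # size of the training block

# Time stamp of the first held-out week, then everything from there on.
valid_start <- time(x_demo)[n_train + 1]
valid_demo <- window(x_demo, start = valid_start)

length(valid_demo)          # number of held-out observations
start(valid_demo)           # (year, week) of the first held-out observation
as.numeric(valid_demo)[1]   # position of that observation in the full series
```

Because the residuals of the full-series fit carry the same time index as x, windowing them from the start of the validation period returns exactly the held-out portion.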


Source: https://stackoverflow.com/questions/27298311/error-in-arima-of-r-too-few-non-missing-observations
