I am trying to evaluate a tslm model using time series cross-validation. I want to use a fixed model (without parameter re-estimation) and look at the 1- to 3-step-ahead forecasts over an evaluation period covering the last year.
I am having trouble getting tsCV and tslm from the forecast package to work together. What am I missing?
library(forecast)
library(ggfortify)
AirPassengers_train <- head(AirPassengers, 100)
AirPassengers_test <- tail(AirPassengers, 44)
## Holdout Evaluation
n_train <- length(AirPassengers_train)
n_test <- length(AirPassengers_test)
pred_train <- ts(rnorm(n_train))
pred_test <- ts(rnorm(n_test))
fit <- tslm(AirPassengers_train ~ trend + pred_train)
forecast(fit, newdata = data.frame(pred_train = pred_test)) %>%
accuracy(AirPassengers_test)
#> ME RMSE MAE MPE MAPE MASE
#> Training set 1.135819e-15 30.03715 23.41818 -1.304311 10.89785 0.798141
#> Test set 3.681350e+01 76.39219 55.35298 6.513998 11.96379 1.886546
#> ACF1 Theil's U
#> Training set 0.6997632 NA
#> Test set 0.7287923 1.412804
## tsCV Evaluation
fc_reg <- function(x) forecast(x, newdata = data.frame(pred_train = pred_test),
h = h, model = fit)
tsCV(AirPassengers_test, fc_reg, h = 1)
#> Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
#> 1957 NA NA NA NA NA NA NA NA
#> 1958 NA NA NA NA NA NA NA NA NA NA NA NA
#> 1959 NA NA NA NA NA NA NA NA NA NA NA NA
#> 1960 NA NA NA NA NA NA NA NA NA NA NA NA
forecast(AirPassengers_test, newdata = data.frame(pred_train = pred_test),
h = 1, model = fit)
#> Error in forecast.ts(AirPassengers_test, newdata = data.frame(pred_train = pred_test),
#> : Unknown model class
I have a feeling that https://gist.github.com/robjhyndman/d9eb5568a78dbc79f7acc49e22553e96 is relevant. How would I apply it to the scenario above?
For time series cross-validation, you should be fitting a separate model to every training set, not passing an existing model. With predictor variables, the function needs to be able to grab the relevant elements when fitting each model, and other elements when producing forecasts.
The following will work.
fc <- function(y, h, xreg)
{
  if(NROW(xreg) < length(y) + h)
    stop("Not enough xreg data for forecasting")
  X <- xreg[seq_along(y),]
  fit <- tslm(y ~ X)
  X <- xreg[length(y)+seq(h),]
  forecast(fit, newdata=X)
}
# Predictors of the same length as the data
# and with the same time series characteristics.
pred <- ts(rnorm(length(AirPassengers)), start=start(AirPassengers),
frequency=frequency(AirPassengers))
# Now pass the whole time series and the corresponding predictors
tsCV(AirPassengers, fc, xreg=pred)
If you have more than one predictor variable, then xreg should be a matrix.
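The same idea of refitting on every training set can also be written out by hand as a rolling-origin loop, which makes it explicit which predictor values go into fitting and which into forecasting. This is only a sketch under assumptions: the predictor is random noise as in the question, the evaluation origins cover the last 12 months, and the names y_train, x_train, origins and errors are illustrative rather than part of the forecast package. It reuses the tslm/forecast pattern from the holdout example in the question.

```r
library(forecast)

set.seed(1)
y <- AirPassengers
n <- length(y)
h <- 3  # evaluate 1- to 3-step-ahead forecasts

# Hypothetical predictor with the same time series attributes as y
pred <- ts(rnorm(n), start = start(y), frequency = frequency(y))

# Forecast origins: the 12 months before the end of the series
origins <- (n - 12):(n - 1)
errors <- matrix(NA_real_, nrow = length(origins), ncol = h,
                 dimnames = list(NULL, paste0("h=", seq_len(h))))

for (j in seq_along(origins)) {
  i <- origins[j]
  # Training data: everything up to and including the origin
  y_train <- ts(y[1:i], start = start(y), frequency = frequency(y))
  x_train <- ts(pred[1:i], start = start(y), frequency = frequency(y))
  fit <- tslm(y_train ~ trend + x_train)  # refit for every origin

  # Only forecast as far as observed data allows
  k <- min(h, n - i)
  future <- data.frame(x_train = pred[i + seq_len(k)])
  fc <- forecast(fit, newdata = future)

  errors[j, seq_len(k)] <- y[i + seq_len(k)] - as.numeric(fc$mean)
}

# RMSE for each forecast horizon
rmse_by_h <- sqrt(colMeans(errors^2, na.rm = TRUE))
rmse_by_h
```

Because the loop does not depend on how tsCV passes xreg to the forecast function, it can serve as a cross-check on the tsCV result.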
I ended up using a function that forecasts a trend. I'm not sure whether this is correctly specified, but the RMSE looks about right.
flm <- function(y, h) { forecast(tslm(y ~ trend, lambda=0), h=h) }
e <- tsCV(tsDF, flm, h=6)
sqrt(mean(e^2, na.rm=TRUE))
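To look at each forecast horizon separately rather than pooling all of them into one number, the RMSE can be taken column by column, since tsCV returns one column of errors per horizon when h > 1. A minimal sketch, assuming AirPassengers stands in for tsDF (which is not defined in the post) and h = 3 to match the 1- to 3-step-ahead horizons in the question:

```r
library(forecast)

# Same forecast function as above: linear trend on the log scale (lambda = 0),
# refitted on every training set that tsCV passes in
flm <- function(y, h) forecast(tslm(y ~ trend, lambda = 0), h = h)

# Assumption: AirPassengers is used here in place of tsDF
e <- tsCV(AirPassengers, flm, h = 3)

# One RMSE per horizon (one column of e per step ahead)
rmse_by_h <- sqrt(colMeans(e^2, na.rm = TRUE))
rmse_by_h
```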
@robhyndman
Source: https://stackoverflow.com/questions/50255912/timeseries-crossvalidation-in-r-using-tscv-with-tslm-models