Question
I am trying to build an ARIMAX forecasting model in R and need some guidance on how covariates are handled in the xreg argument.
I understand that auto.arima() takes care of differencing the covariates when it fits the model on the training data, and that I also don't need to difference the covariates when generating forecasts for the test period (future values). However, when fitting the model with Arima() using custom (p, d, q) and (P, D, Q)[m] values with d or D greater than 0, do I need to difference the covariates manually? If I do, the differenced covariate matrix ends up with fewer rows than the dependent variable has data points.
How should one handle this?
- Should I pass the covariate matrix as it is, i.e. without differencing?
- Should I difference it and drop the first few observations, for which no differenced covariate values are available?
- Should I keep the actual values in the first few rows, where differenced values are not available, and use differenced values in the remaining rows?
- If I have to pass flag variables (1/0) in the xreg matrix, should I difference those as well, or cbind the actual flag values with the differenced values of the other variables?
Also, when generating forecasts for a future period, how do I pass the covariate values (as they are, or after differencing)?
I am using the following code:
library(forecast)  # Arima(), forecast(), ndiffs(), nsdiffs()

# Orders of differencing: estimated from the data when either order is "auto",
# otherwise taken from the supplied (p, d, q) and (P, D, Q) orders
use_auto <- identical(pdq_order, "auto") || identical(PDQ_order, "auto")
ndiff  <- if (use_auto) ndiffs(ts_train_PowerTransformed)  else pdq_order[2]
nsdiff <- if (use_auto) nsdiffs(ts_train_PowerTransformed) else PDQ_order$order[2]

# Creating the appropriate covariates matrix after doing differencing
# (seasonal differencing first, then regular differencing)
xreg_differenced <- ts_CovariatesData_TrainingPeriod
if (nsdiff >= 1) xreg_differenced <- diff(xreg_differenced, lag = PDQ_order$period, differences = nsdiff)
if (ndiff >= 1)  xreg_differenced <- diff(xreg_differenced, lag = 1, differences = ndiff)

# Fitting the model
model_arimax <- Arima(ts_train_PowerTransformed, order = pdq_order, seasonal = PDQ_order, xreg = xreg_differenced)

# Generating Forecast for the test period (xreg passed as a numeric matrix)
fit.test <- model_arimax %>% forecast(h = length(ts_test),
  xreg = as.matrix(diff(diff(ts_CovariatesData_TestPeriod, lag = PDQ_order$period, differences = nsdiff), lag = 1, differences = ndiff)))
Kindly suggest.
Answer 1:
Arima() will difference both the response variable and the xreg variables as specified in the order and seasonal arguments. You should never need to do the differencing yourself.
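To illustrate the point, here is a minimal sketch on simulated data (the series, the 0/1 flag, the ARIMA(1,1,0) order and all variable names are assumptions made up for this example, not taken from the question). It fits one model with d = 1 on the raw covariates, and a second model with d = 0 on manually differenced data; if Arima() differences xreg internally, the two sets of regression coefficients should be very close. It also passes the future covariates to forecast() in their original, undifferenced form.

```r
library(forecast)

set.seed(1)
n_train <- 100
n_test  <- 20
n       <- n_train + n_test

# Simulated covariates (hypothetical): a random-walk regressor and a 0/1 flag
x    <- cumsum(rnorm(n))
flag <- as.numeric(seq_len(n) %% 12 == 0)
X    <- cbind(x = x, flag = flag)

# Response: linear in the covariates plus a random-walk error
y_all   <- 5 + 2 * x + 3 * flag + cumsum(rnorm(n))
y_train <- ts(y_all[1:n_train])
X_train <- X[1:n_train, , drop = FALSE]
X_test  <- X[(n_train + 1):n, , drop = FALSE]

# (a) d = 1 with the covariates passed as they are, flag included
fit_levels <- Arima(y_train, order = c(1, 1, 0), xreg = X_train)

# (b) d = 0 on data differenced by hand (one row shorter, no mean term)
fit_manual <- Arima(diff(y_train), order = c(1, 0, 0),
                    xreg = diff(X_train), include.mean = FALSE)

# Regression coefficients from both fits, side by side
print(cbind(levels = coef(fit_levels)[c("x", "flag")],
            manual = coef(fit_manual)[c("x", "flag")]))

# Forecast with the undifferenced future covariates; h follows nrow(xreg)
fc <- forecast(fit_levels, xreg = X_test)
print(length(fc$mean))
```

Under these assumptions, fit (a) keeps every training row, so the length mismatch described in the question does not arise, and the flag column needs no special treatment.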
Source: https://stackoverflow.com/questions/49404108/do-we-need-to-do-differencing-of-exogenous-variables-before-passing-to-xreg-argu