Do we need to do differencing of exogenous variables before passing to xreg argument of Arima() in R?

Submitted by 与世无争的帅哥 on 2019-12-10 11:38:22

Question


I am trying to build a forecasting model using ARIMAX in R and need some guidance on how covariates are handled in the xreg argument.

I understand that the auto.arima function takes care of differencing the covariates while fitting the model (on training-period data), and that I also don't need to difference the covariates when generating forecasts for the test period (future values). However, when fitting the model using Arima() in R with custom (p, d, q) and (P, D, Q)[m] values where d or D is greater than 0, do we need to difference the covariates manually? If I do, the differenced covariate matrix has fewer rows than the dependent variable has data points.

How should one handle this?

  • Should I pass the covariate matrix as is, i.e. without differencing?
  • Should I difference it but omit the first few observations, for which differenced covariate data is not available?
  • Should I keep the actual values for the first few rows, where differenced covariate values are not available, and use differenced values for the remaining rows?
  • If I have to pass flag variables (1/0) in the xreg matrix, should I difference those as well, or cbind the actual values of the flag variables with the differenced values of the remaining variables?

Also, when generating forecasts for the future period, how do I pass the covariate values (as is, or after differencing)?

I am using the following code:

library(forecast)

# Use automatic differencing orders if either order is "auto"; otherwise take d and D from the supplied orders
use_auto <- identical(pdq_order, "auto") || identical(PDQ_order, "auto")
ndiff  <- if (use_auto) ndiffs(ts_train_PowerTransformed) else pdq_order[2]
nsdiff <- if (use_auto) nsdiffs(ts_train_PowerTransformed) else PDQ_order$order[2]

# Creating the appropriate covariates matrix after doing differencing
# (seasonal differencing first, then ordinary differencing, each only if its order is >= 1)
xreg_differenced <- ts_CovariatesData_TrainingPeriod
if (nsdiff >= 1) xreg_differenced <- diff(xreg_differenced, lag = PDQ_order$period, differences = nsdiff)
if (ndiff >= 1) xreg_differenced <- diff(xreg_differenced, lag = 1, differences = ndiff)

# Fitting the model
model_arimax <- Arima(ts_train_PowerTransformed, order = pdq_order, seasonal = PDQ_order, xreg = xreg_differenced)

# Generating Forecast for the test period (xreg passed as a numeric matrix)
xreg_test_differenced <- diff(diff(ts_CovariatesData_TestPeriod, lag = PDQ_order$period, differences = nsdiff), lag = 1, differences = ndiff)
fit.test <- model_arimax %>% forecast(h = length(ts_test), xreg = as.matrix(xreg_test_differenced))

Kindly suggest.


Answer 1:


Arima will difference both the response variable and the xreg variables as specified in the order and seasonal arguments. You should never need to do the differencing yourself.
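As a rough illustration of the point above, here is a minimal sketch on simulated data (the series, the seed and the names x_train, x_future, fit_levels and fit_manual are all made up for this example, not taken from the question). It passes the covariate at its original level with d = 1, then fits the manually differenced equivalent for comparison; the two regression coefficients should agree closely. Forecasting also takes the future covariate values undifferenced.

```r
library(forecast)

set.seed(42)
n <- 120   # training length (illustrative)
h <- 12    # forecast horizon (illustrative)

# Simulated covariate and response: y = 2 * x + random-walk noise
x_all <- cumsum(rnorm(n + h))
y_all <- 2 * x_all + cumsum(rnorm(n + h))

# xreg is passed as a numeric matrix with a named column
x_train  <- matrix(x_all[1:n], ncol = 1, dimnames = list(NULL, "x"))
x_future <- matrix(x_all[(n + 1):(n + h)], ncol = 1, dimnames = list(NULL, "x"))
y_train  <- ts(y_all[1:n], frequency = 12)

# Covariate passed at its original level; d = 1 is applied to y and xreg alike
fit_levels <- Arima(y_train, order = c(1, 1, 0), xreg = x_train)

# Manually differenced equivalent, fitted with d = 0 and no mean, for comparison only
fit_manual <- Arima(diff(y_train), order = c(1, 0, 0),
                    xreg = diff(x_train), include.mean = FALSE)

print(coef(fit_levels)["x"])
print(coef(fit_manual)["x"])

# Future covariates are also supplied undifferenced; the horizon is nrow(x_future)
fc <- forecast(fit_levels, xreg = x_future)
print(length(fc$mean))
```

Because the differencing happens inside the model, the xreg matrix keeps the same number of rows as the response, so no observations need to be dropped, and flag (1/0) columns can be passed alongside the other covariates at their actual values.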



Source: https://stackoverflow.com/questions/49404108/do-we-need-to-do-differencing-of-exogenous-variables-before-passing-to-xreg-argu
