statespace.SARIMAX model: why does the model use all the data for training, and then predict over a range of the training data?

鱼传尺愫 2021-02-06 02:09

I followed the tutorial to study the SARIMAX model: https://www.digitalocean.com/community/tutorials/a-guide-to-time-series-forecasting-with-arima-in-python-3. The date range of

2 answers
  •  执笔经年
    2021-02-06 02:40

    The author is right. When you fit a regression (linear, higher-order or logistic, it doesn't matter), it is perfectly fine for the predictions to deviate from your training data. For instance, logistic regression can give a false positive even on the training data.

    The same holds for time series. I think the author predicted over part of the training range to show that the model was built correctly.
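
    A minimal sketch of that point, using an ordinary least-squares line as a stand-in for the SARIMAX fit (the data values and the helper name fit_line are made up for illustration): the model is fitted on all points, then asked to predict those same points, and the predictions still differ from the observations.

```python
# Fit a straight line by ordinary least squares on ALL points, then predict
# on the same (training) points. The data values are made up for illustration.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [0, 1, 2, 3, 4, 5]
ys = [1.0, 2.9, 5.2, 6.8, 9.1, 11.0]  # roughly y = 2x + 1 plus noise

slope, intercept = fit_line(xs, ys)

# In-sample predictions: the model has already seen every one of these points.
in_sample = [slope * x + intercept for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, in_sample)]

for x, y, y_hat in zip(xs, ys, in_sample):
    print(f"x={x}  observed={y:5.2f}  predicted={y_hat:5.2f}")
```

    The residuals are small but not zero, which is the expected behaviour of a fitted model on its own training data.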

    seasonal_order=(1, 1, 1, 12)
    

    If you look at the statsmodels tsa documentation, you will see that for quarterly data you have to set the last parameter (s) to 4, and for monthly data to 12. So if you want to work with weekly data, seasonal_order should look like this

    seasonal_order=(1, 1, 1, 52)
    

    Daily data with a yearly cycle would be

    seasonal_order=(1, 1, 1, 365)
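
    To see what the last value s does, here is a rough sketch in plain Python of a single seasonal difference, which is what the second entry of seasonal_order (D = 1) requests: each observation is compared with the one s steps earlier. The helper name seasonal_difference and the monthly numbers are made up for illustration; this is not how statsmodels implements it internally.

```python
def seasonal_difference(series, s):
    """Return y[t] - y[t - s] for every t that has a value s steps earlier."""
    if s <= 0 or s >= len(series):
        raise ValueError("s must be between 1 and len(series) - 1")
    return [series[t] - series[t - s] for t in range(s, len(series))]

# Three years of made-up monthly data: a pattern that repeats every 12 months
# plus an upward drift of 10 units per year.
monthly_pattern = [5, 3, 4, 8, 12, 15, 18, 17, 13, 9, 6, 4]
series = [value + 10 * year for year in range(3) for value in monthly_pattern]

# With the correct period the repeating pattern cancels out completely.
diffed = seasonal_difference(series, s=12)
print(diffed)  # every entry is 10: only the yearly drift is left

# With a wrong period the pattern does not cancel.
wrong = seasonal_difference(series, s=4)
print(wrong[:6])
```

    Choosing s so that it matches the length of the real cycle in the data is what makes the seasonal part of the model useful.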
    

    The order argument holds the non-seasonal parameters p, d and q, in that order. You have to choose them based on how your data behaves

    • p. You can interpret it as whether the value p steps back has an influence on the current value. In other words, if you have daily data and p is 6, you can read it as whether Monday's data has an influence on the following Sunday's data.
    • d. Differencing parameter. It defines the order of integration of your process, that is, how many times you have to apply the differencing operator to make your time series stationary.
    • q. You can interpret it as how many prior noise terms (errors) affect the current value.
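
    As a rough illustration of the three parameters, here is a sketch in plain Python. The helper names difference and arma_one_step and all coefficient values are hypothetical; a real model estimates the coefficients from the data, and this is not the statsmodels implementation.

```python
def difference(series, d):
    """Apply the differencing operator y[t] - y[t-1] to the series d times."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series

def arma_one_step(history, errors, phi, theta):
    """One-step-ahead ARMA forecast without a constant term.

    history: past observations, oldest first
    errors:  past one-step forecast errors, oldest first
    phi:     AR coefficients, phi[0] applies to the most recent observation
    theta:   MA coefficients, theta[0] applies to the most recent error
    """
    p, q = len(phi), len(theta)
    if len(history) < p or len(errors) < q:
        raise ValueError("need at least p observations and q errors")
    ar_part = sum(phi[i] * history[-1 - i] for i in range(p))
    ma_part = sum(theta[j] * errors[-1 - j] for j in range(q))
    return ar_part + ma_part

# d: a series with a quadratic trend becomes constant after two differences.
squares = [1, 4, 9, 16, 25]
print(difference(squares, 1))  # [3, 5, 7, 9]
print(difference(squares, 2))  # [2, 2, 2]

# p = 2 and q = 1 with hypothetical coefficients:
# the last two observations and the last error feed the forecast.
phi = [0.5, 0.2]
theta = [0.3]
forecast = arma_one_step(history=[10.0, 12.0], errors=[1.0], phi=phi, theta=theta)
print(round(forecast, 2))  # 0.5*12 + 0.2*10 + 0.3*1 = 8.3
```

    A larger p or q simply means longer phi or theta lists, so more past values or more past errors enter the forecast.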

    Here is a good answer on how you can find the non-seasonal component values
