Question
I have a slightly modified time series problem. My data has two index variables, date and user id, and for each (user id, date) pair I want to forecast a value.
The interesting part is that the date resets for each new user id.
A standard time series problem has a single series over one time period and asks for a forecast of the next n days.
In my train data, I have the target value for each user id for Jan 1-3.
In my test data, the dates for each user id are Jan 4-6.
For both the train and test data, the index of the dataframe is the date.
My data
id,date,week_day,target
1,2019-01-01,1,10
1,2019-01-02,2,6
1,2019-01-03,3,7
2,2019-01-01,1,8
2,2019-01-02,1,5
2,2019-01-03,1,4
As you can see, the date resets for each new id, so I can't create a train dataset by saying that the first N rows are the train data and the next N rows are the test data.
I only kept date and target in the train data, with the date as the index.
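To make the "date resets per id" structure concrete, here is a minimal sketch that splits the sample rows into one date-ordered series per id. It uses only the standard library; the names TRAIN_CSV, split_by_id and train_by_id are made up for this illustration, and with a pandas dataframe the same grouping would normally be done with df.groupby('id').

```python
import csv
import io
from collections import defaultdict

# Training rows copied from the sample above.
TRAIN_CSV = """id,date,week_day,target
1,2019-01-01,1,10
1,2019-01-02,2,6
1,2019-01-03,3,7
2,2019-01-01,1,8
2,2019-01-02,1,5
2,2019-01-03,1,4
"""

def split_by_id(csv_text):
    """Return {id: [(date, target), ...]} with each list sorted by date."""
    series = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        series[row["id"]].append((row["date"], float(row["target"])))
    for points in series.values():
        points.sort()  # ISO-formatted dates sort correctly as strings
    return dict(series)

train_by_id = split_by_id(TRAIN_CSV)
for user_id, points in train_by_id.items():
    print(user_id, points)
```

Each value in train_by_id is then an ordinary single time series covering Jan 1-3 for one user, which is the shape a standard forecasting setup expects.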
My test dataset
id,date,week_day,target
1,2019-01-4,1,15
1,2019-01-5,2,13
1,2019-01-6,3,8
2,2019-01-4,1,7
2,2019-01-5,1,7
2,2019-01-6,1,4
Like the train dataset, the date resets for each new user id.
I only kept date and user_id for the test data, with the date as the index.
My code
(This is what I've tried, but I'm not sure if I am doing it right.)
stepwise_model = auto_arima(df[['target']], exogenous=df[['id']],
                            start_p=1, start_q=1,
                            max_p=3, max_q=3, m=12,
                            start_P=0, seasonal=True,
                            d=1, D=0, trace=True,
                            error_action='ignore',
                            suppress_warnings=True,
                            stepwise=True)
predicted = stepwise_model.predict(n_periods=len(test), exogenous=test)
This works, but I thought n_periods should be the number of days into the future that we want to forecast (in my case, 3). I used n_periods = len(test) instead, because otherwise I got an error saying that the number of periods is not the same as the length of the test data.
Am I doing this right?
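The mismatch between the two candidate values for n_periods can be seen by simply counting rows. The sketch below is only an illustration with hand-copied (id, date) pairs; test_rows, total_rows and rows_per_id are hypothetical names and no model is fitted.

```python
from collections import Counter

# (id, date) pairs copied from the test sample above.
test_rows = [
    (1, "2019-01-4"), (1, "2019-01-5"), (1, "2019-01-6"),
    (2, "2019-01-4"), (2, "2019-01-5"), (2, "2019-01-6"),
]

# What len(test) returns: every row, across all ids.
total_rows = len(test_rows)

# Number of future dates per user, i.e. the horizon for a single id.
rows_per_id = Counter(user_id for user_id, _ in test_rows)

print(total_rows)
print(dict(rows_per_id))
```

So n_periods = len(test) asks for 6 consecutive steps ahead of one concatenated series, while each user only has 3 future dates. If a separate model were fitted per id (an assumption, not what the code above does), the horizon for each model would be 3 and the exogenous data passed to predict would need 3 rows.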
Source: https://stackoverflow.com/questions/54411958/how-to-use-arima-for-data-with-2-indexes