How to handle Shift in Forecasted value

断了今生、忘了曾经 submitted on 2019-12-02 21:04:19

You asked for my help at:

stock prediction : GRU model predicting same given values instead of future stock price

I hope this is not too late. One thing you can try is to make your features less numerically explicit. Let me explain:

As in my answer to the previous topic, a regression algorithm will use whatever values are in the time window you give it to minimize the error. Suppose you are trying to predict the closing price of BTC at time t. One of your features is the previous closing prices, supplied as a time-series window of the last 20 inputs, from t-20 to t-1. A regressor will probably learn to pick the closing value at time step t-1 or t-2, or something near it, which is effectively cheating. Think of it this way: if the closing price was $6340 at t-1, predicting $6340 or something close to it at t keeps the error very low. But the algorithm has not learned any pattern; it just replicates its input, so it does nothing beyond fulfilling its optimization objective.
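To see why copying the last value is so attractive to a regressor, here is a minimal sketch with NumPy. The random-walk series is a synthetic stand-in, not real BTC data, and the names `persistence_pred` and `mean_pred` are just illustrative. It compares the error of "predict close[t-1]" against a constant guess:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic random-walk "closing prices" (an assumption; not real BTC data).
close = 6340.0 + np.cumsum(rng.normal(0.0, 25.0, size=1000))

# Persistence baseline: predict close[t] with close[t-1].
target = close[1:]
persistence_pred = close[:-1]

# A constant guess (the mean of the series), for comparison.
mean_pred = np.full_like(target, close.mean())


def mse(y_true, y_pred):
    """Mean squared error between two equally long arrays."""
    return float(np.mean((y_true - y_pred) ** 2))


persistence_mse = mse(target, persistence_pred)
mean_mse = mse(target, mean_pred)

print(f"persistence MSE: {persistence_mse:.1f}")
print(f"mean-guess MSE:  {mean_mse:.1f}")
```

If your trained model does not clearly beat the persistence error on held-out data, it has probably learned little more than this copy rule.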

By analogy with my example, what I mean by reducing the explicitness is this: do not give the closing prices directly; scale them, or do not use the explicit values at all. Do not use any feature that shows the closing price explicitly to the algorithm, and do not use open, high, low, etc. for every time step. You will need to be creative here and engineer features that remove the explicit values. For example, you can use squared close differences (in my experience, a regressor can still steal from the past when given linear differences) or their ratio to volume. Alternatively, you can make the features categorical by digitizing them in a way that makes sense for your data. The point is not to give the algorithm a direct hint about what it should predict; only provide patterns for it to work on.
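As a rough illustration of these feature ideas, the sketch below builds squared close differences, their ratio to volume, and a digitized (categorical) change feature with NumPy. The data is synthetic, and the quantile bin edges are one arbitrary choice among many, not a recommendation:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data (assumption): closing prices and traded volume.
close = 6340.0 + np.cumsum(rng.normal(0.0, 25.0, size=500))
volume = rng.uniform(100.0, 1000.0, size=500)

# Close-to-close difference; the first step has no predecessor, so drop it.
diff = np.diff(close)
vol = volume[1:]

# Squared differences hide both the sign and the absolute price level.
sq_diff = diff ** 2

# Ratio of the squared change to the volume of the same step.
sq_diff_per_volume = sq_diff / vol

# Categorical version: bucket the relative changes into quantile bins.
pct_change = diff / close[:-1]
edges = np.quantile(pct_change, [0.2, 0.4, 0.6, 0.8])
change_bucket = np.digitize(pct_change, edges)  # integer codes 0..4

# One row per time step, no raw closing price anywhere.
features = np.column_stack([sq_diff, sq_diff_per_volume, change_bucket])
print(features.shape)
```

None of these columns lets the model read the previous closing price directly, which is the whole point.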

Depending on your task, a quicker approach may be available. If predicting roughly what percentage change your labels show is enough, you can do multi-class classification; just be careful about class imbalance. If only the up/down movements matter, you can go straight to binary classification. As long as you are not leaking data from the training set into the test set, replication or shifting problems are seen only in regression tasks. If possible, avoid regression for windowed time-series applications.
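Turning the regression target into class labels could look like the following sketch. The percentage thresholds are hypothetical and would need tuning for your data; the inverse-frequency weights are one common way to deal with imbalance, not the only one:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic closing prices (assumption; replace with your own series).
close = 6340.0 + np.cumsum(rng.normal(0.0, 25.0, size=500))

# Percentage change from t-1 to t; the labels are built from this.
pct_change = np.diff(close) / close[:-1] * 100.0

# Binary labels: 1 if the price rose, 0 otherwise.
up_down = (pct_change > 0).astype(int)

# Multi-class labels using hypothetical thresholds (in percent).
thresholds = np.array([-0.5, -0.1, 0.1, 0.5])
change_class = np.digitize(pct_change, thresholds)  # integer codes 0..4

# Check the class balance before training.
class_counts = np.bincount(change_class, minlength=len(thresholds) + 1)

# Inverse-frequency weights; np.maximum avoids division by zero
# for a class that happens to be empty.
class_weights = class_counts.sum() / (
    len(class_counts) * np.maximum(class_counts, 1)
)

print("class counts: ", class_counts)
print("class weights:", np.round(class_weights, 2))
```

Most classifiers and loss functions accept per-class weights in some form, so the counts above are worth inspecting before you train.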

If I have misunderstood or missed anything, I will be around. I hope this helps. Good luck.

Most likely your LSTM is learning to guess roughly what its previous input value was (modulated a bit). That's why you see a "shift".

So let's say your data looks like:

x = [1, 1, 1, 4, 5, 4, 1, 1]

And your LSTM learned to just output the previous input for the current timestep. Then your output would look like:

y = [?, 1, 1, 1, 4, 5, 4, 1]

Because your network has some complicated machinery, it is not quite this straightforward, but in principle the "shift" you see is caused by this phenomenon.
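One way to check for this on your own predictions is to compare them against the true series at several lags and see which lag fits best. The sketch below does this for the toy x and y from the example; `mse_at_lag` is a hypothetical helper written for this illustration, not a library function:

```python
import numpy as np

x = np.array([1, 1, 1, 4, 5, 4, 1, 1], dtype=float)

# A "model" that only echoes the previous input; the first output is unknown.
y = np.concatenate([[np.nan], x[:-1]])


def mse_at_lag(truth, pred, lag):
    """MSE between pred[t] and truth[t - lag], skipping undefined entries."""
    if lag > 0:
        t, p = truth[:-lag], pred[lag:]
    else:
        t, p = truth, pred
    mask = ~np.isnan(p)
    return float(np.mean((t[mask] - p[mask]) ** 2))


errors = {lag: mse_at_lag(x, y, lag) for lag in range(0, 4)}
best_lag = min(errors, key=errors.get)

print(errors)
print("best lag:", best_lag)
```

If the best-fitting lag is 1 rather than 0, as it is here, the predictions line up with the previous input instead of the value they were supposed to forecast.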
