Question
I am trying to forecast an organization's future profit from its past records using regression. I am following this link. For testing purposes, I changed the sample data, and it produced these results:
My actual data will be dates and profits, and the values will go up and down rather than increase steadily. I realized that the method above works for sample data that keeps increasing, where the prediction is quite accurate. However, when I changed the data to the values in the screenshot, which go up and down wildly, the prediction is no longer accurate.
Is there any way to increase the accuracy of the regression, given that my data will go up and down?
Thanks!
Answer 1:
When you do a regression, you are fitting a model to the data. In other words, you are saying "here is an equation that describes roughly how the data behaves". In the linear regression case, the model (the equation) is:
y = a * x + b
Where x is the input and y is the output. By doing a linear regression you are saying "my data follows a straight line, here is my data, what are the parameters a and b that best fit the data?".
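As a rough illustration of what "finding a and b" means, here is a minimal least-squares sketch in plain JavaScript. The helper name `fitLine` and the sample numbers are made up for this example; they are not from the question or from any library.

```javascript
// Ordinary least-squares fit of y = a * x + b.
// `fitLine` is a hypothetical helper, not part of any library.
function fitLine(xs, ys) {
  const n = xs.length;
  const meanX = xs.reduce((sum, x) => sum + x, 0) / n;
  const meanY = ys.reduce((sum, y) => sum + y, 0) / n;
  let sxy = 0; // sum of (x - meanX) * (y - meanY)
  let sxx = 0; // sum of (x - meanX)^2
  for (let i = 0; i < n; i++) {
    sxy += (xs[i] - meanX) * (ys[i] - meanY);
    sxx += (xs[i] - meanX) ** 2;
  }
  const a = sxy / sxx;         // slope
  const b = meanY - a * meanX; // intercept
  return { a, b };
}

// Made-up data that lies exactly on y = 2x + 1.
const line = fitLine([1, 2, 3, 4], [3, 5, 7, 9]);
const nextValue = line.a * 5 + line.b; // prediction for x = 5
console.log(line, nextValue);
```

With data that really does follow a straight line, the fitted parameters recover the line and the prediction for the next x is reliable.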
Obviously, if your data does not follow a straight line, this will work badly. For instance, look at this image I found on Google Images.
Clearly you can see the data has some kind of complex wavy shape - it goes up and down and then up again. The linear model is not complex enough to express this shape (it can only do straight lines). So it doesn't fit well.
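One way to put a number on "doesn't fit well" is R², the share of the variation in y that the straight line explains. The sketch below is only an illustration: `rSquaredOfLine` is a hypothetical helper and both data sets are synthetic.

```javascript
// R^2 of a straight-line fit: 1 means a perfect fit,
// values near 0 mean the line explains very little.
// `rSquaredOfLine` is a hypothetical helper name.
function rSquaredOfLine(xs, ys) {
  const n = xs.length;
  const meanX = xs.reduce((sum, x) => sum + x, 0) / n;
  const meanY = ys.reduce((sum, y) => sum + y, 0) / n;
  let sxy = 0;
  let sxx = 0;
  let syy = 0;
  for (let i = 0; i < n; i++) {
    sxy += (xs[i] - meanX) * (ys[i] - meanY);
    sxx += (xs[i] - meanX) ** 2;
    syy += (ys[i] - meanY) ** 2;
  }
  // For a one-variable linear regression, R^2 is the squared correlation.
  return (sxy * sxy) / (sxx * syy);
}

const days = [0, 1, 2, 3, 4, 5, 6, 7];
const steady = days.map((d) => 10 + 3 * d);         // synthetic straight line
const wavy = days.map((d) => 10 + 5 * Math.sin(d)); // synthetic up-and-down data
const r2Steady = rSquaredOfLine(days, steady);
const r2Wavy = rSquaredOfLine(days, wavy);
console.log(r2Steady, r2Wavy);
```

The steady series gives an R² of essentially 1, while the wavy series gives a much lower value, which is the numeric counterpart of the picture described above.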
Since you need a more complex model, you have to choose one. There are dozens of standard ones, and you can make up your own. A model is just an equation with a fixed set of parameters that can be adjusted so that the equation fits your data.
I suggest you play around with the trend line options in Excel or Google Sheets to get a feel for this. See the Trendline Types bit here for some common models.
Note that none of those will work well for monthly profit because none of them are really cyclical. You probably want a model that is a combination of some repeating multipliers to capture month-to-month variations, and then a linear or polynomial component to capture the fact that yearly profit is increasing or decreasing over time.
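A minimal sketch of that kind of model, under some loud assumptions: the data points are equally spaced, the fitted trend stays positive, and at least one full cycle of history is available. It fits a linear trend first, then averages the ratio of actual value to trend for each position in the cycle to get the repeating multipliers. The function name `fitSeasonalTrend` and the quarterly numbers are invented for illustration; monthly data would use `period = 12`.

```javascript
// Linear trend multiplied by a repeating seasonal factor.
// `fitSeasonalTrend` is a hypothetical helper, not a library function.
function fitSeasonalTrend(values, period) {
  const n = values.length;
  const meanT = (n - 1) / 2; // mean of the time indices 0..n-1
  const meanV = values.reduce((sum, v) => sum + v, 0) / n;
  let stv = 0;
  let stt = 0;
  for (let t = 0; t < n; t++) {
    stv += (t - meanT) * (values[t] - meanV);
    stt += (t - meanT) ** 2;
  }
  const slope = stv / stt;
  const intercept = meanV - slope * meanT;
  const trend = (t) => slope * t + intercept;

  // Average ratio of actual value to trend for each slot in the cycle.
  const sums = new Array(period).fill(0);
  const counts = new Array(period).fill(0);
  for (let t = 0; t < n; t++) {
    sums[t % period] += values[t] / trend(t);
    counts[t % period] += 1;
  }
  const multipliers = sums.map((s, k) => s / counts[k]);

  const predict = (t) => trend(t) * multipliers[t % period];
  return { slope, intercept, multipliers, predict };
}

// Synthetic quarterly profit: rising trend, strong first quarter, weak second.
const trueMultipliers = [1.2, 0.8, 1.0, 1.0];
const quarterlyProfit = Array.from(
  { length: 12 },
  (_, t) => (100 + 2 * t) * trueMultipliers[t % 4]
);

const model = fitSeasonalTrend(quarterlyProfit, 4);
const seasonalForecast = model.predict(12);                    // next quarter
const trendOnlyForecast = model.slope * 12 + model.intercept;  // plain line
console.log(seasonalForecast, trendOnlyForecast);
```

On this synthetic series the seasonal forecast lands much closer to the value the generating formula gives for the next quarter than the plain straight-line forecast does. Real profit data will be noisier, so treat this as a starting point rather than a finished method.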
You don't want a model that is too expressive, however; otherwise you will overfit the data (basically, the model will see patterns in the noise).
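One common way to spot overfitting is to hold back the most recent points, fit on the rest, and compare errors. The sketch below is an assumption-laden illustration: the data is synthetic (a line plus hand-picked noise), and the "too expressive" model is a deliberately extreme lookup table that memorizes every training point. All function names are hypothetical.

```javascript
// Straight-line model fitted by least squares; returns a predictor.
function leastSquaresLine(xs, ys) {
  const n = xs.length;
  const meanX = xs.reduce((sum, x) => sum + x, 0) / n;
  const meanY = ys.reduce((sum, y) => sum + y, 0) / n;
  let sxy = 0;
  let sxx = 0;
  for (let i = 0; i < n; i++) {
    sxy += (xs[i] - meanX) * (ys[i] - meanY);
    sxx += (xs[i] - meanX) ** 2;
  }
  const a = sxy / sxx;
  const b = meanY - a * meanX;
  return (x) => a * x + b;
}

// Extreme "too expressive" model: remembers every training point
// and answers with the y of the nearest remembered x.
function memorizer(xs, ys) {
  return (x) => {
    let best = 0;
    for (let i = 1; i < xs.length; i++) {
      if (Math.abs(xs[i] - x) < Math.abs(xs[best] - x)) best = i;
    }
    return ys[best];
  };
}

function meanSquaredError(predict, xs, ys) {
  return xs.reduce((sum, x, i) => sum + (predict(x) - ys[i]) ** 2, 0) / xs.length;
}

// Synthetic data: y = 2x + 1 plus small hand-picked noise.
const noise = [0.5, -0.4, 0.3, -0.6, 0.2, -0.1, 0.4, -0.3, 0.1, -0.2];
const allX = noise.map((_, i) => i);
const allY = allX.map((x) => 2 * x + 1 + noise[x]);

// Fit on the first 7 points, hold out the last 3.
const trainX = allX.slice(0, 7);
const trainY = allY.slice(0, 7);
const testX = allX.slice(7);
const testY = allY.slice(7);

const lineModel = leastSquaresLine(trainX, trainY);
const lookupModel = memorizer(trainX, trainY);

const errors = {
  lineTrain: meanSquaredError(lineModel, trainX, trainY),
  lineTest: meanSquaredError(lineModel, testX, testY),
  lookupTrain: meanSquaredError(lookupModel, trainX, trainY),
  lookupTest: meanSquaredError(lookupModel, testX, testY),
};
console.log(errors);
```

The lookup model has zero error on the points it was fitted to, noise included, yet it does far worse than the simple line on the held-out points. A large gap between training error and held-out error is the usual warning sign.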
Source: https://stackoverflow.com/questions/46881282/simple-regression-prediction-algorithm-in-javascript