How to know if underfitting or overfitting is occurring?

独厮守ぢ 2020-12-07 05:19

I'm trying to do image classification with two classes. I have 1000 images with balanced classes. When I train the model, I get a low constant validation accuracy but a dec

1 answer
  • 2020-12-07 05:53

    What is overfitting?

    Overfitting (or underfitting) occurs when a model is too specific (or not specific enough) to the training data, and so doesn't generalize well to the true domain. I'll just say overfitting from now on to save my poor typing fingers [*].

    I think the Wikipedia image is good:

    Clearly, the green line, a decision boundary trying to separate the red class from the blue, is "overfit", because although it will do well on the training data, it lacks the "regularized" form we like to see when generalizing [**].

    These CMU slides on overfitting/cross validation also make the problem clear:

    And here's some more intuition for good measure


    When does overfitting occur, generally?

    Overfitting shows up numerically when the testing error stops tracking the training error.

    Obviously, the testing error will always (in expectation) be worse than the training error, but past a certain number of iterations the testing loss starts to increase even as the training loss continues to decline.
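
    As a rough illustration of that numerical signal, here is a minimal sketch that takes per-epoch loss histories and reports where the testing (validation) loss bottomed out. The function name `diagnose_loss_curves`, the `patience` threshold, and the loss values are all made up for this example; they are not from the question.

```python
def diagnose_loss_curves(train_loss, val_loss, patience=3):
    """Return (best_epoch, overfitting) from per-epoch loss histories.

    best_epoch is the index of the lowest validation loss. overfitting is
    True when at least `patience` epochs have passed since best_epoch and
    the training loss has kept falling since then.
    """
    if not val_loss or len(train_loss) != len(val_loss):
        raise ValueError("loss histories must be non-empty and equally long")

    # Index of the first epoch with the minimum validation loss.
    best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)
    epochs_since_best = len(val_loss) - 1 - best_epoch
    train_still_falling = train_loss[-1] < train_loss[best_epoch]

    overfitting = epochs_since_best >= patience and train_still_falling
    return best_epoch, overfitting


# Hypothetical histories: training loss keeps dropping, validation loss turns up.
train = [0.90, 0.60, 0.45, 0.35, 0.28, 0.22, 0.18, 0.15]
val = [0.95, 0.70, 0.58, 0.55, 0.57, 0.62, 0.68, 0.75]
best_epoch, overfitting = diagnose_loss_curves(train, val)
print(best_epoch, overfitting)
```

    A short `patience` makes the check jumpy on noisy curves, so treat the flag as a prompt to look at the plot, not as a verdict.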


    How to tell when a model has overfit visually?

    Overfitting can be observed by plotting the decision boundary (as in the Wikipedia image above) when dimensionality allows, or by looking at testing loss in addition to training loss during the fit procedure.

    You don't give us enough points to make these graphs, but here's an example (from someone asking a similar question) showing what those loss graphs would look like:

    While loss curves are sometimes prettier and more logarithmic, note the trend here: training error is still decreasing but testing error is on the rise. That's a big red flag for overfitting. SO discusses loss curves here.

    The slightly cleaner and more real-life example is from this CMU lecture on overfitting ANNs:

    The top graph is overfitting, as before. The bottom graph is not.


    When does this occur?

    When a model has too many parameters, it is susceptible to overfitting (like an n-degree polynomial fit to n-1 points). Likewise, a model with too few parameters can underfit.

    Certain regularization techniques, like dropout or batch normalization, or traditionally L1 regularization, combat this. I believe this is beyond the scope of your question.
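
    To make the polynomial analogy concrete, here is a small pure-Python sketch. It is my own toy setup, not something from the question: 8 points from the line y = x with alternating ±0.5 noise, fit once with a straight line (2 parameters) and once with the degree-7 polynomial that passes through every point (8 parameters). The helper names are hypothetical.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate at x the unique polynomial passing through all (xs, ys)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(predict, xs, ys):
    """Mean squared error of predict over the given points."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Assumed toy data: the true relationship is y = x.
train_x = [float(i) for i in range(8)]
train_y = [x + 0.5 * (-1) ** i for i, x in enumerate(train_x)]
test_x = [i + 0.5 for i in range(7)]  # unseen points between the training ones
test_y = list(test_x)                 # noise-free truth

slope, intercept = fit_line(train_x, train_y)


def line(x):
    return slope * x + intercept


def poly(x):
    return lagrange_interpolate(train_x, train_y, x)


line_train, line_test = mse(line, train_x, train_y), mse(line, test_x, test_y)
poly_train, poly_test = mse(poly, train_x, train_y), mse(poly, test_x, test_y)
print(f"line: train={line_train:.3f} test={line_test:.3f}")
print(f"poly: train={poly_train:.3f} test={poly_test:.3f}")
```

    The polynomial reproduces the training points exactly but swings far from the true line between them, while the straight line has a worse training error and a much better testing error.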

    Further reading:

    1. A good statistics-SO question and answers
    2. Dense reading: bounds on overfitting with some models
    3. Lighter reading: general overview
    4. The related bias-variance tradeoff

    Footnotes

    [*] There's no reason to keep writing "overfitting/underfitting", since the reasoning is the same for both, but the indicators are flipped, obviously (a decision boundary that hasn't latched onto the true border enough, as opposed to being too tightly wrapped around individual points). In general, overfitting is the more common problem to avoid, since "more iterations/more parameters" is the current theme. If you have lots of data and not a lot of parameters, maybe you really are worried about underfitting, but I doubt it.

    [**] One way to formalize the idea that the black line is preferable to the green one in the first image from Wikipedia is to penalize the number of parameters required by your model during model selection.
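
    One common criterion of this kind (my choice of example, not the only option) is the Akaike information criterion. For a least-squares fit with Gaussian errors it reduces, up to an additive constant, to n·ln(RSS/n) + 2k, where k is the number of parameters. A minimal sketch with made-up numbers:

```python
import math


def aic_least_squares(rss, n_points, n_params):
    """AIC (up to a constant) for a least-squares fit with Gaussian errors.

    rss is the residual sum of squares on the training data. Lower is better.
    """
    if rss <= 0 or n_points <= 0:
        raise ValueError("rss and n_points must be positive")
    return n_points * math.log(rss / n_points) + 2 * n_params


# Hypothetical fits to the same 20 points: the flexible model has a lower
# residual, but pays for its six extra parameters.
simple = aic_least_squares(rss=1.9, n_points=20, n_params=2)
flexible = aic_least_squares(rss=1.5, n_points=20, n_params=8)
preferred = "simple" if simple < flexible else "flexible"
print(preferred)
```

    Note that a model that fits the training data exactly (RSS of zero) has no finite score under this formula, which is why the sketch rejects it.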
