Keras EarlyStopping: Which min_delta and patience to use?

误落风尘 2021-02-05 18:01

I am new to deep learning and Keras, and one of the improvements I am trying to make to my model training process is to make use of Keras's keras.callbacks.EarlyStopping c

2 Answers
  • 2021-02-05 18:43

    The role of the two parameters is clear from the Keras documentation.

    min_delta : minimum change in the monitored quantity to qualify as an improvement, i.e. an absolute change of less than min_delta, will count as no improvement.

    patience : number of epochs with no improvement after which training will be stopped.
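
To make the two definitions concrete, here is a minimal pure-Python sketch of the stopping rule for a monitored quantity that should decrease, such as val_loss. It is a simplified model of the behaviour described above, not Keras's actual implementation; the function name `early_stopping_epoch` and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, min_delta=0.0, patience=0):
    """Simulate early stopping on a list of per-epoch validation losses.

    Returns (stop_epoch, best_epoch), both 1-based. stop_epoch is None
    if the stopping condition is never met.
    """
    best = float("inf")
    best_epoch = None
    wait = 0  # consecutive epochs without a sufficient improvement
    for epoch, loss in enumerate(val_losses, start=1):
        # An epoch counts as an improvement only if the loss beats the
        # best value seen so far by more than min_delta.
        if loss < best - min_delta:
            best = loss
            best_epoch = epoch
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch


# Hypothetical validation losses, for illustration only.
losses = [0.50, 0.40, 0.395, 0.392, 0.41, 0.42]

# With min_delta=0 every small decrease resets the counter.
print(early_stopping_epoch(losses, min_delta=0.0, patience=2))

# With min_delta=0.01 the small decreases at epochs 3 and 4 no longer count.
print(early_stopping_epoch(losses, min_delta=0.01, patience=2))
```

On these made-up numbers, raising min_delta makes training stop earlier, because the tiny gains after epoch 2 are treated as no improvement.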

    There is no standard value for these parameters. You need to analyse the factors involved in the training process (dataset, environment, model type) to decide their values.

    (1). patience

    • Dataset - If the categories in the dataset are hard to tell apart (for example, faces of people in the age groups 25-30 and 30-35), the loss will change slowly and erratically. In such cases a higher value for patience is a good idea; a clean, well-separated dataset can use a lower one.
    • Model type - When training a GAN, the change in accuracy is small in most cases, and each epoch consumes a good amount of GPU time. In such cases it is better to use a low value of patience and save checkpoint files every few epochs, then resume from those checkpoints to improve the model further as required. Analyse other model types similarly.
    • Runtime environment - When training on a CPU, each epoch is time-consuming, so a smaller value for patience is preferable. A larger value can be tried on a GPU.

    (2). min_delta

    • To decide min_delta, run a few epochs and look at how the error and validation accuracy change from epoch to epoch. Choose min_delta according to that rate of change. The default value of 0 works well in many cases.
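
One possible way to "see the change" after a short pilot run is to summarise the epoch-to-epoch differences in the validation loss. This is only a sketch of a heuristic, not something prescribed by Keras; the helper name `summarize_changes` and the pilot values are hypothetical.

```python
from statistics import median


def summarize_changes(val_losses):
    """Summarise absolute epoch-to-epoch changes in the validation loss."""
    deltas = [abs(later - earlier)
              for earlier, later in zip(val_losses, val_losses[1:])]
    return {"min": min(deltas), "median": median(deltas), "max": max(deltas)}


# Hypothetical validation losses from a six-epoch pilot run.
pilot = [0.90, 0.70, 0.60, 0.55, 0.53, 0.52]
summary = summarize_changes(pilot)
print(summary)
```

A min_delta well below the typical change would behave almost like the default of 0, while a value near or above it would treat most epochs as no improvement. Where exactly to set it between those extremes remains a judgment call for your dataset.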
  • 2021-02-05 18:54

    Your parameters are valid first choices.

    However, as pointed out by Akash, this is dependent on the dataset and on how you split your data, e.g. your cross-validation scheme. You might want to observe the behavior of your validation error for your model first and then choose these parameters accordingly.

    Regarding min_delta: I've found that 0 or a value much smaller than 1 (<< 1), like yours, works quite well in many cases. Again, look first at how wildly your error fluctuates.

    Regarding patience: if you set it to n, you will get the model from n epochs after the best model. Common choices lie between 0 and 10, but again, this will depend on your dataset and especially on the variability within the dataset.

    Finally, EarlyStopping is behaving properly in the example you gave. The optimum that eventually triggered early stopping is found in epoch 4: val_loss: 0.0011. After that, the training finds 5 more validation losses that all lie above or are equal to that optimum and finally terminates 5 epochs later.
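
The counting described above can be replayed with a small pure-Python sketch. Only the epoch 4 value (0.0011) comes from the question; every other loss value, the function name `trace_early_stopping`, and the choice of patience=5 with min_delta=0 are assumptions made for illustration.

```python
def trace_early_stopping(val_losses, patience, min_delta=0.0):
    """Return a list of (epoch, loss, wait) tuples up to the stopping epoch."""
    best = float("inf")
    wait = 0
    trace = []
    for epoch, loss in enumerate(val_losses, start=1):
        # Strictly-better check: a loss equal to the best does not count.
        improved = loss < best - min_delta
        if improved:
            best = loss
            wait = 0
        else:
            wait += 1
        trace.append((epoch, loss, wait))
        if not improved and wait >= patience:
            break  # training would stop after this epoch
    return trace


# Hypothetical losses: the optimum 0.0011 is reached at epoch 4, and the
# next five epochs are all above or equal to it.
losses = [0.0030, 0.0018, 0.0013, 0.0011,
          0.0012, 0.0011, 0.0013, 0.0012, 0.0011, 0.0010]

for epoch, loss, wait in trace_early_stopping(losses, patience=5):
    print(f"epoch {epoch}: val_loss={loss:.4f} wait={wait}")
```

In this replay the counter reaches 5 at epoch 9, so the tenth epoch is never run, even though its hypothetical loss would have been lower. The weights at that point are those of epoch 9 rather than epoch 4; Keras's EarlyStopping accepts a restore_best_weights argument if you want the weights from the best epoch instead.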
