问题
I want to implement a two-step learning process where:
1) pre-train a model for a few epochs using the loss function loss_1
2) change the loss function to loss_2
and continue the training for fine-tuning
Currently, my approach is:
model.compile(optimizer=opt, loss=loss_1, metrics=['accuracy'])
model.fit_generator(…)
model.compile(optimizer=opt, loss=loss_2, metrics=['accuracy’])
model.fit_generator(…)
Note that the optimizer remains the same, and only the loss function changes. I'd like to smoothly continue training, but with a different loss function. According to this post, re-compiling the model loses the optimizer state. Questions:
a) Will I lose the optimizer state even if I use the same optimizer, eg Adam?
b) if the answer to a) is yes, any suggestions on how to change the loss function to a new one without reseting the optimizer state?
EDIT:
As suggested by Simon Caby and based on this thread, I created a custom loss function with two loss computations that depend on epoch number. However, it does not work for me. My approach:
def loss_wrapper(t_change, current_epoch):
def custom_loss(y_true, y_pred):
c_epoch = K.get_value(current_epoch)
if c_epoch < t_change:
# compute loss_1
else:
# compute loss_2
return custom_loss
And I compile as follows, after initializing current_epoch
:
current_epoch = K.variable(0.)
model.compile(optimizer=opt, loss=loss_wrapper(5, current_epoch), metrics=...)
To update the current_epoch
, I create a callback:
class NewCallback(Callback):
def __init__(self, current_epoch):
self.current_epoch = current_epoch
def on_epoch_end(self, epoch, logs={}):
K.set_value(self.current_epoch, epoch)
model.fit_generator(..., callbacks=[NewCallback(current_epoch)])
The callback updates self.current_epoch
every epoch correctly. But the update does not reach the custom loss function. Instead, current_epoch
keeps the initialization value forever, and loss_2
is never executed.
Any suggestion is welcome, thanks!
回答1:
My answers : a) yes, and you should probably make your own learning rate scheduler in order to keep control of it :
keras.callbacks.LearningRateScheduler(schedule, verbose=0)
b) yes you can create your own loss function, including one that flutuates between two different loss methods. see : "Advanced Keras — Constructing Complex Custom Losses and Metrics" https://towardsdatascience.com/advanced-keras-constructing-complex-custom-losses-and-metrics-c07ca130a618
回答2:
If you change:
def loss_wrapper(t_change, current_epoch):
def custom_loss(y_true, y_pred):
c_epoch = K.get_value(current_epoch)
if c_epoch < t_change:
# compute loss_1
else:
# compute loss_2
return custom_loss
to:
def loss_wrapper(t_change, current_epoch):
def custom_loss(y_true, y_pred):
# compute loss_1 and loss_2
bool_case_1=K.less(current_epoch,t_change)
num_case_1=K.cast(bool_case_1,"float32")
loss = (num_case_1)*loss_1 + (1-num_case_1)*loss_2
return loss
return custom_loss
it works.
We are essentially required to turn python code into compositions of backend functions for the loss to work without having to update in a re-compile of model.compile(...)
. I am not satisfied with these hacks, and wish it was possible to set model.loss
in a callback without re-compiling model.compile(...)
after (since then the optimizer states are reset).
来源:https://stackoverflow.com/questions/55406146/resume-training-with-different-loss-function