I want to switch from Adam to SGD after a certain number of epochs. How do I do this smoothly so that the weights/gradients are passed over to the new optimizer?
Just define two optimizers and switch between them:
import tensorflow as tf

# Placeholders X/Y, the model, cost, learning_rate, sess and the training
# data are assumed to be defined elsewhere.
sgd_optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
adam_optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)
...
for epoch in range(100):
    # Adam for the first 50 epochs, plain SGD afterwards.
    optimizer = sgd_optimizer if epoch > 50 else adam_optimizer
    for (x, y) in zip(train_X, train_Y):
        sess.run(optimizer, feed_dict={X: x, Y: y})
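The tf.train optimizers above belong to the TensorFlow 1.x API. If you are on TensorFlow 2.x, the same switch works with the Keras optimizers inside a custom training loop; here is a minimal sketch, assuming model, loss_fn and dataset are defined by you:

import tensorflow as tf

adam = tf.keras.optimizers.Adam(learning_rate=1e-3)
sgd = tf.keras.optimizers.SGD(learning_rate=1e-3)

for epoch in range(100):
    # Same rule as above: Adam first, then SGD after epoch 50.
    optimizer = sgd if epoch > 50 else adam
    for x, y in dataset:
        with tf.GradientTape() as tape:
            loss = loss_fn(y, model(x, training=True))
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))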
An optimizer only encapsulates how the gradients are applied to the variables; it may keep a few internal variables of its own (for example, Adam's moment estimates), but the model weights are not stored in the optimizer, so you can switch between optimizers easily.
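You can see this separation by listing the graph's variables: Adam adds its own slot variables next to the weights, while plain gradient descent adds none. A rough TF 1.x illustration with a hypothetical variable W (exact variable names can differ slightly between versions):

import tensorflow as tf

W = tf.Variable(tf.zeros([3, 1]), name="W")
cost = tf.reduce_sum(tf.square(W))
adam_op = tf.train.AdamOptimizer(0.01).minimize(cost)           # adds W/Adam, W/Adam_1, beta power accumulators
sgd_op = tf.train.GradientDescentOptimizer(0.01).minimize(cost) # adds no extra variables

for v in tf.global_variables():
    print(v.name)   # W:0 plus Adam's slot/accumulator variables

Just make sure to run tf.global_variables_initializer() after defining both optimizers so that Adam's extra variables are initialized as well.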
Source: https://stackoverflow.com/questions/46850835/how-do-i-switch-tf-train-optimizers-during-training