问题
Is there any way that I can log the validaton loss and accuracy to tensorboard when using tf-slim? When I was using keras, the following code can do this for me:
model.fit_generator(generator=train_gen(), validation_data=valid_gen(),...)
Then the model will evaluate the validation loss and accuracy after each epoch, which is very convenient. But how to achieve this using tf-slim? The following steps are using primitive tensorflow, which is not what I want:
with tf.Session() as sess:
for step in range(100000):
sess.run(train_op, feed_dict={X: X_train, y: y_train})
if n % batch_size * batches_per_epoch == 0:
print(sess.run(train_op, feed_dict={X: X_train, y: y_train}))
Right now, the steps to train a model using tf-slim is:
tf.contrib.slim.learning.train(
train_op=train_op,
logdir="logs",
number_of_steps=10000,
log_every_n_steps = 10,
save_summaries_secs=1
)
So how to evaluate validation loss and accuracy after each epoch with the above slim training procedure?
Thanks in advance!
回答1:
The matter is still being discussed on TF Slim repo (issue #5987). The framework allows you to easily create an evaluation script to run after / in parallel of your training (solution 1 below), but some people are pushing to be able to implement the "classic cycle of batch training + validation" (solution 2).
1. Use slim.evaluation
in another script
TF Slim has evaluation methods e.g. slim.evaluation.evaluation_loop()
you can use in another script (which can be run in parallel of your training) to periodically load the latest checkpoint of your model and perform evaluation. TF Slim page contains a good example how such a script may look: example.
2. Provide a custom train_step_fn
to slim.learning.train()
A patchy solution the initiator of the discussion came up with makes use of a custom training step function you can provide to slim.learning.train()
:
"""
Snippet from code by Kevin Malakoff @kmalakoff
https://github.com/tensorflow/tensorflow/issues/5987#issue-192626454
"""
# ...
accuracy_validation = slim.metrics.accuracy(
tf.argmax(predictions_validation, 1),
tf.argmax(labels_validation, 1)) # ... or whatever metrics needed
def train_step_fn(session, *args, **kwargs):
total_loss, should_stop = train_step(session, *args, **kwargs)
if train_step_fn.step % FLAGS.validation_check == 0:
accuracy = session.run(train_step_fn.accuracy_validation)
print('Step %s - Loss: %.2f Accuracy: %.2f%%' % (str(train_step_fn.step).rjust(6, '0'), total_loss, accuracy * 100))
# ...
train_step_fn.step += 1
return [total_loss, should_stop]
train_step_fn.step = 0
train_step_fn.accuracy_validation = accuracy_validation
slim.learning.train(
train_op,
FLAGS.logs_dir,
train_step_fn=train_step_fn,
graph=graph,
number_of_steps=FLAGS.max_steps
)
来源:https://stackoverflow.com/questions/49905773/how-to-log-validation-loss-and-accuracy-using-tfslim