TensorFlow Estimator - High evaluation values on training data

Submitted by 别等时光非礼了梦想 on 2020-01-25 07:53:07

Question


I'm using TensorFlow 1.10 with a custom Estimator. To test my training/evaluation loop, I feed the same image/label pair into the network at every step, so I expected the network to converge quickly, which it does.

I also use the same image for evaluation, but I get a much larger loss than during training. After 2000 training steps, the loss is:

INFO:tensorflow:Loss for final step: 0.01181452

but evaluation gives:

Eval loss at step 2000: 0.41252694

This seems wrong to me. It looks like the same problem as in this thread. Is there anything special to consider when using the evaluate method of Estimator?


Some more details about my code:

I've defined my model (FeatureNet) like here, as a subclass of tf.keras.Model with __init__ and call methods.

My model_fn looks like this:

import tensorflow as tf
import featurenet  # my own module that defines FeatureNet

def model_fn(features, labels, mode):

    resize_shape = (180, 320)
    num_dimensions = 16

    model = featurenet.FeatureNet(resize_shape, num_dimensions=num_dimensions)

    training = (mode == tf.estimator.ModeKeys.TRAIN)
    seg_pred = model(features, training)

    predictions = {
        # Generate predictions (for PREDICT mode)
        "seg_pred": seg_pred
    }
    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)

    # Calculate Loss (for both TRAIN and EVAL modes)
    seg_loss = tf.reduce_mean(tf.keras.backend.binary_crossentropy(labels['seg_true'], seg_pred))
    loss = seg_loss

    # Configure the Training Op (for TRAIN mode)
    if mode == tf.estimator.ModeKeys.TRAIN:
        optimizer = tf.train.MomentumOptimizer(learning_rate=1e-4, momentum=0.9)

        train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step())

        return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)

    # EVAL mode: return only the loss (no additional eval metrics are defined)
    return tf.estimator.EstimatorSpec(mode=mode, loss=loss)

Then, in the main part, I train and evaluate with a custom Estimator:

# Create the Estimator
estimator = tf.estimator.Estimator(
    model_fn=model_fn,
    model_dir="/tmp/discriminative_model"
    )

def input_fn():
    features, labels = create_synthetic_image()

    training_data = tf.data.Dataset.from_tensors((features, labels))
    training_data = training_data.repeat(None)
    training_data = training_data.batch(1)
    training_data = training_data.prefetch(1)
    return training_data

estimator.train(input_fn=input_fn, steps=2000)
eval_results = estimator.evaluate(input_fn=input_fn, steps=50)
print('Eval loss at step %d: %s' % (eval_results['global_step'], eval_results['loss']))

Here, create_synthetic_image creates the same image/label every time.


Answer 1:


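I found that the way BatchNormalization is handled can cause this kind of discrepancy, as described here. In training mode the layer normalizes with the statistics of the current batch, while in evaluation mode it uses moving averages, and those are only updated if their update ops are actually run during training.

Using control_dependencies in the model_fn solved the issue for me (see here).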

if mode == tf.estimator.ModeKeys.TRAIN:
    optimizer = tf.train.MomentumOptimizer(learning_rate=1e-4, momentum=0.9)

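    # Make each training step also run the model's update ops
    # (e.g. the BatchNormalization moving-average updates)
    with tf.control_dependencies(model.get_updates_for(features)):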
        train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step())

    return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)
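
To make the mechanism a bit more concrete, here is a minimal, framework-free sketch of a single batch-normalization channel. It is not TensorFlow code: ToyBatchNorm, run_updates and max_train_eval_gap are hypothetical names made up for this illustration, and the momentum, epsilon and initial moving statistics are only chosen to resemble the Keras defaults. It compares the training-mode output with the evaluation-mode output on the same batch, once with the moving-average updates skipped and once with them applied.

```python
# Toy model of one batch-normalization channel (illustration only,
# all names are hypothetical and do not mirror TensorFlow's API).

class ToyBatchNorm:
    def __init__(self, momentum=0.99, eps=1e-3):
        self.momentum = momentum
        self.eps = eps
        # Initial moving statistics: mean 0, variance 1 (assumed defaults)
        self.moving_mean = 0.0
        self.moving_var = 1.0

    def __call__(self, batch, training, run_updates=True):
        if training:
            # Training mode: normalize with the statistics of this batch
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            if run_updates:
                # This is the step that gets lost when the update ops
                # are not attached to the train op
                m = self.momentum
                self.moving_mean = m * self.moving_mean + (1 - m) * mean
                self.moving_var = m * self.moving_var + (1 - m) * var
        else:
            # Evaluation mode: normalize with the stored moving averages
            mean, var = self.moving_mean, self.moving_var
        return [(x - mean) / (var + self.eps) ** 0.5 for x in batch]


def max_train_eval_gap(run_updates, steps=2000):
    """Largest difference between train-mode and eval-mode outputs."""
    batch = [10.0, 12.0, 14.0, 16.0]  # same input every step, as in the question
    bn = ToyBatchNorm()
    for _ in range(steps):
        train_out = bn(batch, training=True, run_updates=run_updates)
    eval_out = bn(batch, training=False)
    return max(abs(t - e) for t, e in zip(train_out, eval_out))


gap_without_updates = max_train_eval_gap(run_updates=False)
gap_with_updates = max_train_eval_gap(run_updates=True)
print("gap without moving-average updates:", gap_without_updates)
print("gap with moving-average updates:   ", gap_with_updates)
```

Without the updates, evaluation still normalizes with the initial statistics, so the layer output (and therefore the loss) differs from training even on identical data. With the updates, the moving averages converge to the batch statistics and the two modes agree.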


Source: https://stackoverflow.com/questions/53090463/tensorflow-estimator-high-evaluation-values-on-training-data
