TensorFlow batch_norm does not work properly when testing (is_training=False)

Posted by 倾然丶 夕夏残阳落幕 on 2019-12-21 04:46:08

Question


I am training the following model:

with slim.arg_scope(inception_arg_scope(is_training=True)):
    logits_v, endpoints_v = inception_v3(all_v, num_classes=25, is_training=True, dropout_keep_prob=0.8,
                     spatial_squeeze=True, reuse=reuse_variables, scope='vis')
    logits_p, endpoints_p = inception_v3(all_p, num_classes=25, is_training=True, dropout_keep_prob=0.8,
                     spatial_squeeze=True, reuse=reuse_variables, scope='pol')
    pol_features = endpoints_p['pol/features']
    vis_features = endpoints_v['vis/features']

eps = 1e-08
loss = tf.sqrt(tf.maximum(tf.reduce_sum(tf.square(pol_features - vis_features), axis=1, keep_dims=True), eps))

# rest of code
saver = tf.train.Saver(tf.global_variables())

where

def inception_arg_scope(weight_decay=0.00004,
                        batch_norm_decay=0.9997,
                        batch_norm_epsilon=0.001,
                        is_training=True):
    normalizer_params = {
        'decay': batch_norm_decay,
        'epsilon': batch_norm_epsilon,
        'is_training': is_training
    }
    normalizer_fn = tf.contrib.layers.batch_norm

    # Set weight_decay for weights in Conv and FC layers.
    with slim.arg_scope([slim.conv2d, slim.fully_connected],
                        weights_regularizer=slim.l2_regularizer(weight_decay)):
        with slim.arg_scope([slim.batch_norm, slim.dropout], is_training=is_training):
            with slim.arg_scope(
                    [slim.conv2d],
                    weights_initializer=slim.variance_scaling_initializer(),
                    activation_fn=tf.nn.relu,
                    normalizer_fn=normalizer_fn,
                    normalizer_params=normalizer_params) as sc:
                return sc

and inception_v3 is defined here. My model trains very well: the loss drops from 60 to less than 1. But when I test the model in another file:

with slim.arg_scope(inception_arg_scope(is_training=False)):
    logits_v, endpoints_v = inception_v3(all_v, num_classes=25, is_training=False, dropout_keep_prob=0.8,
                     spatial_squeeze=True, reuse=reuse_variables, scope='vis')
    logits_p, endpoints_p = inception_v3(all_p, num_classes=25, is_training=False, dropout_keep_prob=0.8,
                     spatial_squeeze=True, reuse=reuse_variables, scope='pol')

it gives me nonsense results; more precisely, the loss is 1e-8 for all the training and test samples. When I change to is_training=True, the results are more sensible, but the loss is still larger than in the training phase (even when I am testing on the training data). I have the same problem with VGG16: I get 100% accuracy on my test set when I use VGG without batch_norm and 0% when I use batch_norm.

What am I missing here? Thank you,


Answer 1:


I ran into the same problem and solved it. When you use slim.batch_norm, be sure to use slim.learning.create_train_op instead of tf.train.GradientDescentOptimizer(lr).minimize(loss) (or the minimize method of any other optimizer). Try it and see if it works!
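
A likely reason this helps: batch_norm registers the updates of its moving mean and variance in the tf.GraphKeys.UPDATE_OPS collection. slim.learning.create_train_op runs that collection by default, while a bare minimize(loss) does not. If the updates never run, the moving statistics keep their initial values, and is_training=False normalizes with those stale values. The sketch below illustrates this in plain Python rather than TensorFlow; ToyBatchNorm, run_update_ops, and all numbers are hypothetical and chosen only for the illustration.

```python
import math
import random


class ToyBatchNorm:
    """Scalar batch norm with moving statistics (illustrative, not TensorFlow)."""

    def __init__(self, decay=0.9, epsilon=0.001):
        self.decay = decay
        self.epsilon = epsilon
        self.moving_mean = 0.0  # initialized to zero, as batch norm layers usually are
        self.moving_var = 1.0   # initialized to one

    def forward(self, batch, is_training, run_update_ops=True):
        if is_training:
            # Training mode normalizes with the statistics of the current batch.
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            if run_update_ops:
                # Stand-in for the ops collected in UPDATE_OPS.
                d = self.decay
                self.moving_mean = d * self.moving_mean + (1 - d) * mean
                self.moving_var = d * self.moving_var + (1 - d) * var
        else:
            # Inference mode relies entirely on the moving statistics.
            mean, var = self.moving_mean, self.moving_var
        return [(x - mean) / math.sqrt(var + self.epsilon) for x in batch]


def train(bn, steps, run_update_ops, seed=0):
    """Feed batches drawn from N(50, 5^2) through the layer in training mode."""
    rng = random.Random(seed)
    for _ in range(steps):
        batch = [rng.gauss(50.0, 5.0) for _ in range(32)]
        bn.forward(batch, is_training=True, run_update_ops=run_update_ops)


stale = ToyBatchNorm()
train(stale, steps=200, run_update_ops=False)   # like a bare minimize(loss)

updated = ToyBatchNorm()
train(updated, steps=200, run_update_ops=True)  # like create_train_op

test_batch = [45.0, 50.0, 55.0]
stale_out = stale.forward(test_batch, is_training=False)
updated_out = updated.forward(test_batch, is_training=False)
print("stale moving stats  :", stale_out)
print("updated moving stats:", updated_out)
```

With stale statistics the inference outputs stay close to the raw inputs, while with updated statistics they are centered near zero, which is what the later layers saw during training. If you want to keep your own optimizer call, the pattern commonly used in TensorFlow 1.x is to wrap it in tf.control_dependencies(tf.get_collection(tf.GraphKeys.UPDATE_OPS)).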



Source: https://stackoverflow.com/questions/42770757/tensorflow-batch-norm-does-not-work-properly-when-testing-is-training-false
