TensorFlow Custom Estimator - Restore model after small changes in model_fn

囚心锁ツ 2021-02-09 12:46

I am using tf.estimator.Estimator to develop my model.

I wrote a model_fn and trained it for 50,000 iterations. Now I want to make a small change

1 answer
  • 2021-02-09 13:15

    TL;DR The easiest way to load variables from a previous checkpoint is to use the function tf.train.init_from_checkpoint(). Just one call to this function inside the model_fn of your Estimator will override the initializers of the corresponding variables.


    First model with two hidden layers

    In more detail, suppose you have trained a first model with two hidden layers on MNIST, defined by model_fn_1. Its weights are saved in the directory mnist_1.

    import tensorflow as tf  # TensorFlow 1.x API (tf.layers, tf.losses, tf.train)

    def model_fn_1(features, labels, mode):
        images = features['image']
    
        h1 = tf.layers.dense(images, 100, activation=tf.nn.relu, name="h1")
        h2 = tf.layers.dense(h1, 100, activation=tf.nn.relu, name="h2")
    
        logits = tf.layers.dense(h2, 10, name="logits")
    
        loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    
        optimizer = tf.train.GradientDescentOptimizer(0.01)
        train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step())
    
        return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)
    
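    # Estimator 1: two hidden layers.
    # train_input_fn is not shown; it must provide features under the key 'image' plus integer labels.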
    estimator_1 = tf.estimator.Estimator(model_fn_1, model_dir='mnist_1')
    
    estimator_1.train(input_fn=train_input_fn, steps=1000)
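
To make the later restore step easier to follow, here is a rough pure-Python sketch of the variables the checkpoint in mnist_1 would be expected to hold. It relies on the fact that a tf.layers.dense layer called `name` creates the variables `name/kernel` and `name/bias`. The helper `dense_stack_variables` and the input size of 784 (a flattened 28x28 image) are assumptions made for illustration, not part of the original answer.

```python
def dense_stack_variables(input_dim, layers):
    """Return {variable_name: shape} for a stack of dense layers.

    `layers` is a list of (name, units) pairs, applied in order.
    Hypothetical helper: it only mirrors the naming scheme, it does not
    touch TensorFlow.
    """
    variables = {}
    fan_in = input_dim
    for name, units in layers:
        variables[name + "/kernel"] = (fan_in, units)
        variables[name + "/bias"] = (units,)
        fan_in = units  # the next layer consumes this layer's output
    return variables


# Assumption: each MNIST image is flattened to 28 * 28 = 784 features.
INPUT_DIM = 784

model_1_vars = dense_stack_variables(
    INPUT_DIM, [("h1", 100), ("h2", 100), ("logits", 10)]
)

# The real checkpoint additionally stores global_step.
for name, shape in sorted(model_1_vars.items()):
    print(name, shape)
```

These names are what a later assignment map refers to, so giving each layer an explicit `name` is what makes selective restoring possible.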
    

    Second model with three hidden layers

    Now we want to train a new model, model_fn_2, with three hidden layers, and load the trained weights for the first two hidden layers, h1 and h2. We use tf.train.init_from_checkpoint() to do this:

    def model_fn_2(features, labels, mode):
        images = features['image']
    
        h1 = tf.layers.dense(images, 100, activation=tf.nn.relu, name="h1")
        h2 = tf.layers.dense(h1, 100, activation=tf.nn.relu, name="h2")
        h3 = tf.layers.dense(h2, 100, activation=tf.nn.relu, name="h3")
    
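        # Keys are scopes in the checkpoint; values are scopes in the current graph.
        assignment_map = {
            'h1/': 'h1/',
            'h2/': 'h2/'
        }
        # Override the initializers of h1 and h2 with the values stored in mnist_1.
        tf.train.init_from_checkpoint('mnist_1', assignment_map)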
    
        logits = tf.layers.dense(h3, 10, name="logits")
    
        loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    
        optimizer = tf.train.GradientDescentOptimizer(0.01)
        train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step())
    
        return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)
    
    # Estimator 2: three hidden layers
    estimator_2 = tf.estimator.Estimator(model_fn_2, model_dir='mnist_2')
    
    estimator_2.train(input_fn=train_input_fn, steps=1000)
    

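    The assignment_map loads every variable under the scope h1/ in the checkpoint into the scope h1/ of the new model, and likewise for h2/. Variables that are not listed, here those of h3 and logits, keep their regular initializers and are trained from scratch. Don't forget the trailing /: it tells TensorFlow that the entry is a variable scope rather than a single variable name.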
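
The scope matching can be illustrated without TensorFlow. The sketch below is a simplified pure-Python model of the rule described above, not TensorFlow's actual implementation; the helper `resolve_scope_map` and the hard-coded variable lists are assumptions based on the `name/kernel` and `name/bias` naming of dense layers.

```python
def resolve_scope_map(assignment_map, checkpoint_vars, model_vars):
    """Pair each current-model variable with the checkpoint variable it loads.

    Hypothetical helper that mimics scope-to-scope matching only.
    """
    pairs = {}
    for ckpt_key, model_key in assignment_map.items():
        if ckpt_key.endswith("/") and model_key.endswith("/"):
            # Scope entry: match every model variable under that scope.
            for var in model_vars:
                if var.startswith(model_key):
                    source = ckpt_key + var[len(model_key):]
                    if source not in checkpoint_vars:
                        raise ValueError("%s not found in checkpoint" % source)
                    pairs[var] = source
        else:
            # No trailing slash: the entry names one exact variable.
            if ckpt_key not in checkpoint_vars or model_key not in model_vars:
                raise ValueError("no variable %r -> %r" % (ckpt_key, model_key))
            pairs[model_key] = ckpt_key
    return pairs


# Variables assumed to be in the mnist_1 checkpoint (two hidden layers).
checkpoint_vars = [
    "global_step",
    "h1/kernel", "h1/bias",
    "h2/kernel", "h2/bias",
    "logits/kernel", "logits/bias",
]
# Variables assumed to be created by the three-layer model.
model_vars = [
    "global_step",
    "h1/kernel", "h1/bias",
    "h2/kernel", "h2/bias",
    "h3/kernel", "h3/bias",
    "logits/kernel", "logits/bias",
]

restored = resolve_scope_map({"h1/": "h1/", "h2/": "h2/"}, checkpoint_vars, model_vars)
fresh = sorted(set(model_vars) - set(restored))

print("restored:", sorted(restored))
print("freshly initialized:", fresh)

# Dropping the trailing slash turns the entry into an exact variable name,
# and no variable is called just "h1", so this sketch rejects it.
try:
    resolve_scope_map({"h1": "h1"}, checkpoint_vars, model_vars)
except ValueError as err:
    print("rejected:", err)
```

In this sketch only the four h1 and h2 variables are restored. global_step is not part of the map either, which fits the new model writing to its own directory, mnist_2, and counting steps from zero.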


    I couldn't find a way to make this work using pre-made estimators, since you can't change their model_fn.
