TensorFlow - regularization with L2 loss, how to apply to all weights, not just last one?

伪装坚强ぢ 2020-12-22 18:10

I am playing with an ANN that is part of the Udacity Deep Learning course.

I have an assignment which involves introducing generalization to the network with one hidden ReLU layer using L2 loss. How do I introduce the L2 penalty so that all weights are penalized, not just the weights of the output layer?

3 Answers
  • 2020-12-22 18:30

    In fact, we usually do not regularize bias terms (intercepts). So, I go for:

    loss = (tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
        logits=out_layer, labels=tf_train_labels)) +
        0.01*tf.nn.l2_loss(hidden_weights) +
        0.01*tf.nn.l2_loss(out_weights))
    

    Penalizing the intercept term achieves little: since the intercept is simply added to the y values, regularizing it only shifts the predictions by a constant c. Including it or not will not change the results, but it does cost some extra computation.

  • 2020-12-22 18:54

    hidden_weights, hidden_biases, out_weights, and out_biases are all the model parameters that you are creating. You can add L2 regularization to ALL of these parameters as follows:

    loss = (tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
        logits=out_layer, labels=tf_train_labels)) +
        0.01*tf.nn.l2_loss(hidden_weights) +
        0.01*tf.nn.l2_loss(hidden_biases) +
        0.01*tf.nn.l2_loss(out_weights) +
        0.01*tf.nn.l2_loss(out_biases))
    

    Following the note of @Keight Johnson, to avoid regularizing the biases:

    loss = (tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
        logits=out_layer, labels=tf_train_labels)) +
        0.01*tf.nn.l2_loss(hidden_weights) +
        0.01*tf.nn.l2_loss(out_weights))
    
  • 2020-12-22 18:55

    A shorter and more scalable way of doing this would be:

    vars   = tf.trainable_variables() 
    lossL2 = tf.add_n([ tf.nn.l2_loss(v) for v in vars ]) * 0.001
    

    This basically sums the l2_loss of all your trainable variables. You could also make a dictionary where you specify only the variables you want to add to your cost and use the second line above. You can then add lossL2 to your softmax cross-entropy value to calculate your total loss.
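    As a minimal sketch of that combination (assuming TF 1.x graph-style code and the same out_layer and tf_train_labels tensors used in the other answers; beta, total_loss, and the optimizer choice are illustrative names, not part of the original answer):

    import tensorflow as tf

    beta = 0.001  # illustrative regularization strength

    # Unregularized data loss (softmax cross-entropy, as in the other answers).
    cross_entropy = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(logits=out_layer,
                                                labels=tf_train_labels))

    # L2 penalty summed over every trainable variable in the graph.
    lossL2 = tf.add_n([tf.nn.l2_loss(v) for v in tf.trainable_variables()]) * beta

    # The optimizer minimizes the combined loss.
    total_loss = cross_entropy + lossL2
    optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(total_loss)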

    Edit: As mentioned by Piotr Dabkowski, the code above will also regularise biases. This can be avoided by adding an if statement in the second line:

    lossL2 = tf.add_n([ tf.nn.l2_loss(v) for v in vars
                        if 'bias' not in v.name ]) * 0.001
    

    This can be used to exclude other variables.
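    For instance, a variant that also skips batch-norm parameters by name (the substrings below are illustrative and depend on how you actually named your variables):

    excluded = ('bias', 'BatchNorm')  # illustrative name fragments to skip
    lossL2 = tf.add_n([tf.nn.l2_loss(v) for v in tf.trainable_variables()
                       if not any(s in v.name for s in excluded)]) * 0.001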
