How to freeze/lock weights of one TensorFlow variable (e.g., one CNN kernel of one layer)

[亡魂溺海] posted on 2019-12-02 18:35:36

One possible approach is to initialize these specific weights to zero and to modify the minimization step so that gradients are never applied to them. This can be done by replacing the call to minimize() with something like:

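import numpy as np
import tensorflow as tf  # TF 1.x graph-mode API

# Mask: 1 where gradients pass through, 0 for the kernel to freeze.
# float32 so it matches the default dtype of the variable's gradient.
W_conv2_weights = np.ones((3, 3, 32, 32), dtype=np.float32)
W_conv2_weights[:, :, 29, 13] = 0
W_conv2_weights_const = tf.constant(W_conv2_weights)

optimizer = tf.train.RMSPropOptimizer(0.001)

# tf.gradients returns a list, so take its single element
W_conv2_orig_grads = tf.gradients(loss, W_conv2)[0]
W_conv2_grads = tf.multiply(W_conv2_weights_const, W_conv2_orig_grads)
W_conv2_train_op = optimizer.apply_gradients([(W_conv2_grads, W_conv2)])

rest_grads = tf.gradients(loss, rest_of_vars)
rest_train_op = optimizer.apply_gradients(zip(rest_grads, rest_of_vars))

train_op = tf.group(rest_train_op, W_conv2_train_op)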


  1. Prepare a constant tensor that acts as a mask: 1 everywhere except 0 at the positions whose gradients should be canceled.
  2. Compute gradients for W_conv2 only, multiply them element-wise by the constant W_conv2_weights_const to zero the appropriate gradients, and only then apply them.
  3. Compute and apply gradients "normally" for the rest of the variables.
  4. Group the two train ops into a single training op (for example with tf.group).
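
To see why masking the gradient keeps the chosen kernel frozen, here is a minimal NumPy-only sketch of the same idea. It is not the TensorFlow code above: the toy quadratic loss, the plain gradient-descent update, and the names W, mask, target and learning_rate are all assumptions made for illustration. The point is only that an update built from a zeroed gradient leaves those entries exactly where they started.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for W_conv2: 3x3 kernels, 32 input and 32 output channels.
W = rng.normal(size=(3, 3, 32, 32))
W[:, :, 29, 13] = 0.0  # the kernel to freeze starts at zero

# Mask: 1 where updates are allowed, 0 for the frozen kernel.
mask = np.ones_like(W)
mask[:, :, 29, 13] = 0.0

# Toy loss 0.5 * ||W - target||^2, whose gradient is simply (W - target).
target = rng.normal(size=W.shape)
learning_rate = 0.1

W_start = W.copy()
for _ in range(50):
    grad = W - target                    # gradient of the toy loss
    W -= learning_rate * (mask * grad)   # masked gradient-descent step

# The frozen kernel is untouched, while the other weights have moved.
frozen_unchanged = np.array_equal(W[:, :, 29, 13], W_start[:, :, 29, 13])
others_changed = not np.allclose(W, W_start)
print(frozen_unchanged, others_changed)
```

The same reasoning should carry over to any optimizer whose update is zero whenever the gradient is zero. Anything that changes weights independently of the gradient (weight decay, for instance) would need separate handling.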