How to freeze/lock weights of one TensorFlow variable (e.g., one CNN kernel of one layer)

[亡魂溺海] submitted on 2019-12-02 18:35:36

One possible approach is to initialize the specific weights to zero and modify the minimization step so that gradients are never applied to them. This can be done by replacing the call to minimize() with something like:

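import numpy as np
import tensorflow as tf  # TensorFlow 1.x API

# Mask: 1 where the kernel may train, 0 for the frozen slice.
# float32 assumes W_conv2 is a float32 variable.
W_conv2_weights = np.ones((3, 3, 32, 32), dtype=np.float32)
W_conv2_weights[:, :, 29, 13] = 0
W_conv2_weights_const = tf.constant(W_conv2_weights)

optimizer = tf.train.RMSPropOptimizer(0.001)

# tf.gradients returns a list, so take its single element.
W_conv2_orig_grads = tf.gradients(loss, W_conv2)[0]
W_conv2_grads = tf.multiply(W_conv2_weights_const, W_conv2_orig_grads)
W_conv2_train_op = optimizer.apply_gradients([(W_conv2_grads, W_conv2)])

# rest_of_vars is the list of all other trainable variables.
rest_grads = tf.gradients(loss, rest_of_vars)
rest_train_op = optimizer.apply_gradients(zip(rest_grads, rest_of_vars))

# Run this single op in each training step.
train_op = tf.group(rest_train_op, W_conv2_train_op)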

That is:

  1. Prepare a constant Tensor that is 0 at the positions to freeze and 1 everywhere else.
  2. Compute the gradients for W_conv2 only, multiply them element-wise by that constant to zero the gradients of the frozen weights, and only then apply them.
  3. Compute and apply gradients "normally" for the rest of the variables.
  4. Group the two train ops into a single training op.
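The masking idea in the steps above does not depend on TensorFlow. Below is a minimal NumPy sketch of it, assuming a toy linear model trained with plain gradient descent rather than RMSProp. The shapes, the frozen index (1, 0), the learning rate and all variable names are made up for illustration and are not from the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear model y = x @ W with a 3x2 weight matrix (hypothetical sizes).
x = rng.normal(size=(64, 3))
true_W = rng.normal(size=(3, 2))
y = x @ true_W

# Mask: 1 where a weight may train, 0 where it must stay frozen.
mask = np.ones((3, 2))
mask[1, 0] = 0.0

# Initialize the weights and set the frozen entry to zero.
W = rng.normal(size=(3, 2))
W[1, 0] = 0.0
W_init = W.copy()

learning_rate = 0.1
for _ in range(200):
    residual = x @ W - y                  # shape (64, 2)
    # Gradient of the squared error, averaged over the 64 samples.
    grad = 2.0 * x.T @ residual / len(x)
    # Element-wise masking: the frozen entry receives a zero update.
    W -= learning_rate * (mask * grad)

frozen_value = W[1, 0]
changed = np.abs(W - W_init) > 0
print("frozen weight:", frozen_value)
print("entries that moved:\n", changed)
```

The frozen entry keeps its initial value because its update is multiplied by zero at every step, while the remaining entries are free to train. With an adaptive optimizer such as RMSProp the same holds as long as a zero gradient yields a zero update.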