How to freeze/lock weights of one TensorFlow variable (e.g., one CNN kernel of one layer)

太阳男子 2021-01-31 12:59

I have a TensorFlow CNN model that is performing well, and we would like to implement this model in hardware, i.e., on an FPGA. It's a relatively small network but it would be idea

1 answer
  •  醉梦人生
    2021-01-31 13:23

    One possible approach is to initialize these specific weights to zero and modify the minimization step so that gradients are never applied to them. This can be done by replacing the call to minimize() with something like the following (TensorFlow 1.x graph mode; loss, W_conv2, and rest_of_vars are assumed to be defined by your model):

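    import numpy as np
    import tensorflow as tf  # TensorFlow 1.x graph-mode API

    # Gradient mask: 1 = trainable, 0 = frozen.
    # float32 so that it matches the gradient dtype (assumes W_conv2 is float32).
    W_conv2_weights = np.ones((3, 3, 32, 32), dtype=np.float32)
    W_conv2_weights[:, :, 29, 13] = 0
    W_conv2_weights_const = tf.constant(W_conv2_weights)

    optimizer = tf.train.RMSPropOptimizer(0.001)

    # tf.gradients returns a list, so take its single element.
    W_conv2_orig_grads = tf.gradients(loss, W_conv2)[0]
    W_conv2_grads = tf.multiply(W_conv2_weights_const, W_conv2_orig_grads)
    W_conv2_train_op = optimizer.apply_gradients([(W_conv2_grads, W_conv2)])

    # rest_of_vars: a list of all other trainable variables.
    rest_grads = tf.gradients(loss, rest_of_vars)
    rest_train_op = optimizer.apply_gradients(zip(rest_grads, rest_of_vars))

    # Run this single op in the training loop.
    train_op = tf.group(rest_train_op, W_conv2_train_op)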
    

    That is:

    1. Prepare a constant Tensor that cancels the appropriate gradients.
    2. Compute the gradients for W_conv2 only, multiply them element-wise by the constant W_conv2_weights_const to zero out the gradients of the frozen weights, and only then apply them.
    3. Compute and apply gradients "normally" for the rest of the variables.
    4. Group the two train ops into a single training op.
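
To see why the masking keeps the selected weights frozen, here is a minimal NumPy-only sketch of a single update step. It is not TensorFlow code: the plain gradient-descent update, the random `weights`, and the random `grads` are illustrative assumptions standing in for the real optimizer, kernel, and gradient of the loss.

```python
import numpy as np

# Mask with the same shape as the kernel: 1.0 = trainable, 0.0 = frozen.
mask = np.ones((3, 3, 32, 32), dtype=np.float32)
mask[:, :, 29, 13] = 0.0

rng = np.random.default_rng(0)

# Hypothetical kernel weights; the frozen slice is initialized to zero.
weights = rng.normal(size=mask.shape).astype(np.float32) * mask

# Placeholder gradient standing in for d(loss)/d(weights).
grads = rng.normal(size=mask.shape).astype(np.float32)

# One plain gradient-descent step using the masked gradient.
learning_rate = 0.001
updated = weights - learning_rate * (grads * mask)

# The frozen slice is untouched, while other slices have moved.
frozen_unchanged = np.array_equal(updated[:, :, 29, 13], weights[:, :, 29, 13])
others_changed = not np.array_equal(updated[:, :, 0, 0], weights[:, :, 0, 0])
print(frozen_unchanged, others_changed)
```

Because the masked gradient is exactly zero at the frozen positions, any update rule whose step is zero when the gradient is zero leaves those weights where they started. Optimizers that add terms independent of the gradient (for example, weight decay) would need separate handling.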
