“freeze” some variables/scopes in tensorflow: stop_gradient vs passing variables to minimize

无人共我 2020-11-30 18:31

I am trying to implement an adversarial NN, which requires 'freezing' one or the other part of the graph during alternating training minibatches, i.e. there are two sub-networks.

4 Answers
  • 2020-11-30 18:41

    Another option you might want to consider is setting trainable=False on a variable, which means it will not be modified by training.

    tf.Variable(my_weights, trainable=False)
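
    For context, a minimal sketch (assuming the TF 1.x graph API used elsewhere in this thread; the frozen/learned variable names are made up for illustration) of why this works: a variable created with trainable=False never enters the TRAINABLE_VARIABLES collection, so a plain optimizer.minimize(loss) call, which defaults to tf.trainable_variables(), will not touch it.

    import tensorflow as tf

    # Illustrative variables only: `frozen` opts out of the trainable collection.
    frozen = tf.Variable(tf.zeros([2, 2]), trainable=False, name="frozen")
    learned = tf.Variable(tf.zeros([2, 2]), name="learned")

    # Only `learned` is collected; `frozen` is invisible to the default var_list.
    print([v.name for v in tf.trainable_variables()])  # -> ['learned:0']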
    
  • 2020-11-30 18:50

    The easiest way to achieve this, as you mention in your question, is to create two optimizer operations using separate calls to opt.minimize(cost, ...). By default, the optimizer will use all of the variables in tf.trainable_variables(). If you want to filter the variables to a particular scope, you can use the optional scope argument to tf.get_collection() as follows:

    optimizer = tf.train.AdagradOptimizer(0.01)
    
    first_train_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES,
                                         "scope/prefix/for/first/vars")
    first_train_op = optimizer.minimize(cost, var_list=first_train_vars)
    
    second_train_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES,
                                          "scope/prefix/for/second/vars")                     
    second_train_op = optimizer.minimize(cost, var_list=second_train_vars)
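
    In an adversarial setup you would then alternate the two ops across minibatches. A rough usage sketch, assuming the graph above (the 1000-step count is made up for illustration, and any placeholders your cost depends on would need a feed_dict):

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for step in range(1000):
            if step % 2 == 0:
                sess.run(first_train_op)   # updates only the first scope's variables
            else:
                sess.run(second_train_op)  # updates only the second scope's variables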
    
  • 2020-11-30 18:51

    I don't know if my approach has downsides, but I solved this issue for myself with this construct:

    do_gradient = <Tensor that evaluates to 0 or 1>
    no_gradient = 1 - do_gradient
    wrapped_op = do_gradient * original + no_gradient * tf.stop_gradient(original)
    

    So if do_gradient = 1, the values and gradients flow through just fine, but if do_gradient = 0, the values only flow through the stop_gradient op, which stops the gradients from flowing back.

    For my scenario, hooking do_gradient up to an index of a random_shuffle tensor let me randomly train different pieces of my network.
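
    In case it is useful, here is a small self-contained sketch of the same construct (the x/w tensors, their shapes, and the placeholder-driven do_gradient are made-up illustrations, not the author's code):

    import tensorflow as tf

    x = tf.constant([[1.0, 2.0]])
    w = tf.Variable([[1.0], [1.0]])
    original = tf.matmul(x, w)

    do_gradient = tf.placeholder(tf.float32, shape=[])  # fed as 1.0 or 0.0 per step
    no_gradient = 1.0 - do_gradient
    wrapped_op = do_gradient * original + no_gradient * tf.stop_gradient(original)

    loss = tf.reduce_sum(wrapped_op)
    grad_w, = tf.gradients(loss, [w])

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        print(sess.run(grad_w, {do_gradient: 1.0}))  # gradients flow: [[1.], [2.]]
        print(sess.run(grad_w, {do_gradient: 0.0}))  # blocked: all zeros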

  • 2020-11-30 19:00

    @mrry's answer is completely right and perhaps more general than what I'm about to suggest. But I think a simpler way to accomplish it is to just pass the Python variable references directly to var_list:

    W = tf.Variable(...)
    C = tf.Variable(...)
    Y_est = tf.matmul(W,C)
    loss = tf.reduce_sum((data-Y_est)**2)
    optimizer = tf.train.AdamOptimizer(0.001)
    
    # You can pass the python object directly
    train_W = optimizer.minimize(loss, var_list=[W])
    train_C = optimizer.minimize(loss, var_list=[C])
    

    I have a self-contained example here: https://gist.github.com/ahwillia/8cedc710352eb919b684d8848bc2df3a
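
    A runnable version of the same idea, with made-up shapes and random data (this is a hedged sketch, not the linked gist):

    import numpy as np
    import tensorflow as tf

    data = tf.constant(np.random.randn(10, 5).astype(np.float32))
    W = tf.Variable(tf.random_normal([10, 3]))
    C = tf.Variable(tf.random_normal([3, 5]))
    Y_est = tf.matmul(W, C)
    loss = tf.reduce_sum((data - Y_est) ** 2)
    optimizer = tf.train.AdamOptimizer(0.001)

    train_W = optimizer.minimize(loss, var_list=[W])  # only W is updated
    train_C = optimizer.minimize(loss, var_list=[C])  # only C is updated

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for _ in range(200):
            sess.run(train_W)  # fit W while C stays fixed
            sess.run(train_C)  # fit C while W stays fixed
        print(sess.run(loss))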
