I am training a Mask R-CNN model based on this GitHub repo: https://github.com/matterport/Mask_RCNN
I ran into a problem that seems to be related to using Keras, so
The error is basically what the message says. You cannot have a variable initializer inside a conditional. A crude analogy to normal programming languages is:
    if my_condition:
        a = 1
    print(a)  # can't do this: a might be uninitialized.
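To make the analogy concrete, here is a small plain-Python sketch (no TensorFlow involved). The function name read_after_conditional_assignment is made up for illustration; the point is only that a name bound inside a conditional may not exist when it is read afterwards.

```python
def read_after_conditional_assignment(my_condition):
    # `a` is only bound when the branch is taken.
    if my_condition:
        a = 1
    return a  # raises UnboundLocalError when my_condition is False


print(read_after_conditional_assignment(True))

try:
    read_after_conditional_assignment(False)
except UnboundLocalError as err:
    print("uninitialized:", err)
```

TensorFlow refuses to build the equivalent graph up front instead of failing at run time, which is why the error appears while the model is being constructed.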
Here is a simple example to illustrate this issue and the fix suggested in the error message:
    import tensorflow as tf

    def cond(i, _):
        return i < 10

    def body(i, _):
        # Fails: the initial value is a tensor created inside the loop body.
        zero = tf.zeros([], dtype=tf.int32)
        v = tf.Variable(initial_value=zero)
        return (i + 1, v.read_value())

    def body_ok(i, _):
        # Works: the initial value is a callable, so it can be run outside the loop.
        zero = lambda: tf.zeros([], dtype=tf.int32)
        v = tf.Variable(initial_value=zero, dtype=tf.int32)
        return (i + 1, v.read_value())

    tf.while_loop(cond, body, [0, 0])
This uses tf.while_loop, but it behaves the same as tf.cond for this purpose. If you run this code as is, you will get the same error. If you replace body with body_ok, everything will be fine. The reason is that when the initializer is a function, TensorFlow can place it "outside of the control flow context" to make sure it always runs.
To clarify a possible confusion for future readers: the approach to "compute a first" is not ideal, but for a subtle reason. First, remember that what you are doing here is building a computation graph (assuming you are not using eager execution). So you are not actually computing a. You are just defining how it can be computed. The TensorFlow runtime decides what needs to be computed at run time, depending on the arguments to session.run(). So one might expect that if the condition is false, the branch returning a will not be executed (since it is not needed). Unfortunately, this is not how the TensorFlow runtime works. You can find more details in the first answer here, but briefly: the TensorFlow runtime will execute all dependencies of either branch; only the operations inside true_fn/false_fn will be executed conditionally.
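The difference between an op that is a dependency of a branch and an op created inside the branch function can be sketched as follows. This is a minimal illustration, assuming graph mode via tf.compat.v1 (available in TF 1.13+ and 2.x); the names x, y, a, b, z, outside and inside are made up for the example and mirror the pattern used in the tf.cond documentation.

```python
import tensorflow as tf

# Building inside an explicit graph only defines ops; nothing runs until sess.run().
graph = tf.Graph()
with graph.as_default():
    x = tf.compat.v1.placeholder(tf.int32, shape=[])
    y = tf.compat.v1.placeholder(tf.int32, shape=[])
    a = tf.constant(3)
    b = tf.constant(4)

    # z is created OUTSIDE the branch functions, so it is a dependency of the
    # cond and is computed whichever branch is taken.
    z = tf.multiply(a, b)
    outside = tf.cond(x < y, lambda: tf.add(x, z), lambda: tf.square(y))

    # Here the multiply is created INSIDE true_fn, so it only runs when x < y.
    inside = tf.cond(x < y,
                     lambda: tf.add(x, tf.multiply(a, b)),
                     lambda: tf.square(y))

with tf.compat.v1.Session(graph=graph) as sess:
    true_branch = [int(v) for v in sess.run([outside, inside], feed_dict={x: 2, y: 5})]
    false_branch = [int(v) for v in sess.run([outside, inside], feed_dict={x: 5, y: 2})]

print(true_branch)   # [14, 14]
print(false_branch)  # [4, 4]
```

Both versions return the same values; what differs is only which ops the runtime executes. That is why moving an expensive or side-effecting op out of the branch function does not make it conditional.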
I had the same problem while using Keras with a CNN-LSTM. The code worked fine on a GPU server, but when I tried to run it on my local machine, I got this weird error.
The following trick worked for me.
Solution: clear variables and restart your kernel. Maybe this will help someone else who runs into exactly the same problem.