Can cond support TF ops with side effects?

旧时难觅i 2020-12-06 17:34

The (source code) documentation for tf.cond is unclear on whether the functions to be performed when the predicate is evaluated can have side effects or not. I\

2 Answers
  • 2020-12-06 18:07

    Your second version, where the assign_add() and assign_sub() ops are created inside the lambdas passed to cond(), is the correct way to do this. Fortunately, each of the two lambdas is evaluated only once, during the call to cond(), so your graph will not grow without bound.

    Essentially what cond() does is the following:

    1. Create a Switch node, which forwards its input to only one of two outputs, depending on the value of pred. Let's call the outputs pred_true and pred_false. (They have the same value as pred but that's unimportant since this is never directly evaluated.)

    2. Build the subgraph corresponding to the if_true lambda, where all of the nodes have a control dependency on pred_true.

    3. Build the subgraph corresponding to the if_false lambda, where all of the nodes have a control dependency on pred_false.

    4. Zip together the lists of return values from the two lambdas, and create a Merge node for each of these. A Merge node takes two inputs, of which only one is expected to be produced, and forwards it to its output.

    5. Return the tensors that are the outputs of the Merge nodes.

    This means you can run your second version, and be content that the graph remains a fixed size, regardless of how many steps you run.
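The five steps above can be sketched as a toy model in plain Python. This is not TensorFlow code and not how the library is implemented: `DEAD`, `switch`, `merge`, `toy_cond`, `make_adder` and `make_subtractor` are hypothetical names invented for illustration, ops are modelled as zero-argument callables, and each branch returns a single value rather than a list. It only aims to show that the lambdas are called once at build time while the ops they create run only on the live branch.

```python
# Toy model only: NOT TensorFlow. All names here are hypothetical.
DEAD = object()  # marks a Switch output that was not produced


def switch(value, pred):
    """Forward value to exactly one output: (true_output, false_output)."""
    return (value, DEAD) if pred else (DEAD, value)


def merge(a, b):
    """Forward whichever of the two inputs was actually produced."""
    return a if a is not DEAD else b


def toy_cond(if_true, if_false):
    """Build phase: call each lambda exactly once, return a runnable step."""
    true_op = if_true()    # step 2: build the true-branch "subgraph"
    false_op = if_false()  # step 3: build the false-branch "subgraph"

    def run(pred):
        pred_true, pred_false = switch(pred, pred)  # step 1
        # Each branch op runs only if its guard is live.
        out_true = true_op() if pred_true is not DEAD else DEAD
        out_false = false_op() if pred_false is not DEAD else DEAD
        return merge(out_true, out_false)  # steps 4 and 5

    return run


state = {"count": 0}
builds = []  # records how often each lambda is called


def make_adder():
    builds.append("adder")

    def op():
        state["count"] += 1
        return state["count"]

    return op


def make_subtractor():
    builds.append("subtractor")

    def op():
        state["count"] -= 2
        return state["count"]

    return op


step = toy_cond(make_adder, make_subtractor)
results = [step(True), step(True), step(False)]
```

In this sketch `builds` ends up with one entry per lambda no matter how many times `step` is called, which mirrors the point that the graph stays a fixed size.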

    The reason your first version doesn't work is that, when a Tensor is captured (like adder or subtractor in your example), an additional Switch node is added to enforce the logic that the value of the tensor is only forwarded to the branch that actually executes. This is an artifact of how TensorFlow combines feed-forward dataflow and control flow in its execution model. The result is that the captured tensors (in this case the results of the assign_add and assign_sub) will always be evaluated, even if they aren't used, and you'll see their side effects. This is something we need to document better, and as Michael says, we're going to make this more usable in future.

  • 2020-12-06 18:18

    The second case works because you have added the ops within the cond: this causes them to conditionally execute.

    The first case is analogous to saying:

    adder = (count += 1)
    subtractor = (count -= 2)
    if (cond) { adder } else { subtractor }
    

    Since adder and subtractor are outside the conditional, they are always executed.

    The second case is more like saying:

    if (cond) { adder = (count += 1) } else { subtractor = (count -= 2) }
    

    which in this case does what you expected.
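As a runnable version of the two pseudocode cases above, here is a minimal plain-Python analogy. It does not use TensorFlow; `first_case` and `second_case` are hypothetical helper names, and `count` is an ordinary integer standing in for the variable.

```python
def first_case(cond, count):
    """Both updates are evaluated before the conditional picks a result."""
    count += 1
    adder = count          # side effect already happened
    count -= 2
    subtractor = count     # side effect already happened
    result = adder if cond else subtractor
    return result, count


def second_case(cond, count):
    """Each update lives inside its branch, so only one of them runs."""
    if cond:
        count += 1
    else:
        count -= 2
    return count, count


first_true = first_case(True, 0)
second_true = second_case(True, 0)
second_false = second_case(False, 0)
```

With `cond` true and `count` starting at 0, the first case returns the adder's value but leaves `count` at -1 because both updates ran, while the second case leaves `count` at 1.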

    We realize that the interaction between side effects and (somewhat) lazy evaluation is confusing, and we have a medium-term goal to make things more uniform. But the important thing to understand for now is that we do not do true lazy evaluation: the conditional acquires a dependency on every quantity defined outside the conditional that is used within either branch.
