Why input is scaled in tf.nn.dropout in tensorflow?

自闭症患者 2021-01-30 13:31

I can't understand why dropout works like this in TensorFlow. The CS231n notes say that "dropout is implemented by only keeping a neuron active with some probability p (a hyperparameter), or setting it to zero otherwise." However, tf.nn.dropout also scales the values that are kept by 1 / keep_prob. Why is the input scaled like this?
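For example (a minimal sketch, assuming the TensorFlow 1.x API where tf.nn.dropout takes a keep_prob argument):

```python
import tensorflow as tf  # TensorFlow 1.x-style API assumed here

x = tf.ones([10])
y = tf.nn.dropout(x, keep_prob=0.5)

with tf.Session() as sess:
    print(sess.run(y))
    # Roughly half of the entries are zeroed out, but the surviving ones are
    # printed as 2.0 (= 1.0 / keep_prob) instead of 1.0 -- that scaling is
    # what this question is about.
```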

4 Answers
  •  伪装坚强ぢ
    2021-01-30 13:50

    This scaling enables the same network to be used for training (with keep_prob < 1.0) and evaluation (with keep_prob == 1.0). From the Dropout paper:

    The idea is to use a single neural net at test time without dropout. The weights of this network are scaled-down versions of the trained weights. If a unit is retained with probability p during training, the outgoing weights of that unit are multiplied by p at test time as shown in Figure 2.

    Rather than adding ops to scale down the weights by keep_prob at test time, the TensorFlow implementation adds an op to scale up the retained activations by 1. / keep_prob at training time. The effect on performance is negligible, and the code is simpler (because we use the same graph and treat keep_prob as a tf.placeholder() that is fed a different value depending on whether we are training or evaluating the network).
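    To see that the two schemes agree in expectation, here is a small NumPy sketch (just an illustration of the arithmetic, not TensorFlow's actual implementation): with inverted dropout the expected train-time activation already equals its test-time value, so evaluation needs no extra scaling op.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(100_000)                   # stand-in for one layer's activations, E[x] ~ 0.5
keep_prob = 0.8
mask = rng.random(x.shape) < keep_prob    # keep each unit with probability keep_prob

# Standard dropout: drop units at training time, scale down by keep_prob at test time.
train_standard = x * mask
test_standard = x * keep_prob

# Inverted dropout (what tf.nn.dropout does): scale up by 1 / keep_prob at training
# time, so the test-time graph can use the activations unchanged.
train_inverted = x * mask / keep_prob
test_inverted = x

print(train_standard.mean(), test_standard.mean())   # both ~ 0.4 (= keep_prob * E[x])
print(train_inverted.mean(), test_inverted.mean())   # both ~ 0.5 (= E[x])
```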
