Adding multiple layers in TensorFlow causes the loss function to become NaN

半阙折子戏 2021-02-01 23:57

I'm writing a neural-network classifier in TensorFlow/Python for the notMNIST dataset. I've implemented L2 regularization and dropout on the hidden layers. It works fine as

2 Answers
  • 2021-02-02 00:43

    I had the same problem and reducing the batch size and learning rate worked for me.
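
The learning-rate half of this answer can be illustrated with a minimal sketch. It is not the asker's network: it assumes a toy one-parameter loss, `0.5 * curvature * w**2`, and the function name, step count, and curvature value are hypothetical. It only shows how a step size that is too large makes plain gradient descent overshoot, overflow to infinity, and then produce NaN, while a smaller step converges. The effect of batch size is not modeled here.

```python
import math


def run_gradient_descent(learning_rate, steps=2000, curvature=10.0):
    """Minimize loss(w) = 0.5 * curvature * w**2 and return the final loss.

    Toy problem (an assumption for illustration): the update multiplies w by
    (1 - learning_rate * curvature), so it diverges once that factor's
    magnitude exceeds 1.
    """
    w = 1.0
    for _ in range(steps):
        grad = curvature * w          # d(loss)/dw
        w = w - learning_rate * grad  # plain gradient-descent update
    return 0.5 * curvature * w * w


if __name__ == "__main__":
    for lr in (0.25, 0.05):
        loss = run_gradient_descent(lr)
        print(f"lr={lr}: final loss={loss}, NaN={math.isnan(loss)}")
```

With the larger rate, `w` grows in magnitude every step until it overflows to infinity; the next update subtracts infinity from infinity, which is where the NaN comes from.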

  • 2021-02-02 00:49

    It turns out this was not so much a coding issue as a deep learning issue. The extra layer made the gradients too unstable, and that led to the loss function quickly devolving to NaN. The best way to fix this is to use Xavier (Glorot) initialization, which scales the variance of the initial weights to each layer's fan-in and fan-out. Otherwise, the variance of the initial values will tend to be too high, causing instability. Decreasing the learning rate may also help.
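
To make the variance argument concrete, here is a rough standard-library sketch rather than the asker's actual model. It assumes a stack of purely linear layers of equal width (no activations, dropout, or regularization), and the helper names, depth, and width are hypothetical. It compares Xavier uniform weights, drawn from U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)), against standard-normal weights, and measures how large the signal is after several layers.

```python
import math
import random


def xavier_uniform(fan_in, fan_out, rng):
    """Xavier/Glorot uniform weights: U(-limit, limit), limit = sqrt(6 / (fan_in + fan_out))."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_in)] for _ in range(fan_out)]


def unit_normal(fan_in, fan_out, rng):
    """Baseline for comparison: standard-normal weights that ignore the layer width."""
    return [[rng.gauss(0.0, 1.0) for _ in range(fan_in)] for _ in range(fan_out)]


def rms(vec):
    """Root-mean-square magnitude of a vector."""
    return math.sqrt(sum(v * v for v in vec) / len(vec))


def output_rms(init_fn, depth=5, width=128, seed=0):
    """Push one random input through `depth` linear layers and return the output RMS."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(width)]
    for _ in range(depth):
        weights = init_fn(width, width, rng)
        # Matrix-vector product: one output value per weight row.
        x = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return rms(x)


if __name__ == "__main__":
    print("Xavier uniform:", output_rms(xavier_uniform))
    print("Unit normal:   ", output_rms(unit_normal))
```

With Xavier weights the signal stays near its input scale from layer to layer, whereas the unit-normal baseline multiplies it by roughly sqrt(width) per layer, so each added layer makes overflow more likely. In TensorFlow itself, the same scheme is available as `tf.keras.initializers.GlorotUniform`, which is also the default kernel initializer of `tf.keras.layers.Dense`.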
