I'm writing a neural-network classifier in TensorFlow/Python for the notMNIST dataset. I've implemented L2 regularization and dropout on the hidden layers. It works fine with a single hidden layer, but when I add another hidden layer the loss quickly blows up to NaN.
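Roughly, the setup looks like this (a minimal sketch using the tf.keras API; the layer sizes, regularization strength, and learning rate are placeholders, not my actual values):

```python
import tensorflow as tf

# Sketch of a notMNIST-style classifier with L2 regularization and
# dropout on the hidden layer (hyperparameters are illustrative only).
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),            # notMNIST images are 28x28
    tf.keras.layers.Dense(
        1024, activation="relu",
        kernel_regularizer=tf.keras.regularizers.l2(1e-3)),    # L2 penalty on the weights
    tf.keras.layers.Dropout(0.5),                               # dropout on the hidden layer
    tf.keras.layers.Dense(10),                                  # 10 letter classes (A-J)
])
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
```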
I had the same problem, and reducing the batch size and the learning rate worked for me.
Turns out this was not so much a coding issue as a deep-learning issue. The extra layer made the gradients too unstable, which led to the loss quickly devolving into NaN. The best way to fix this is to use Xavier initialization; otherwise the variance of the initial weights tends to be too high, causing instability. Decreasing the learning rate may also help.
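As a sketch of what that change looks like (using the tf.keras names, where `GlorotUniform` is TensorFlow's Xavier initializer; the sizes and learning rate here are just examples):

```python
import tensorflow as tf

# Use Xavier/Glorot initialization so the initial weight variance is scaled
# by each layer's fan-in/fan-out instead of being an arbitrary fixed value.
xavier = tf.keras.initializers.GlorotUniform()

hidden = tf.keras.layers.Dense(
    1024, activation="relu",
    kernel_initializer=xavier,
    kernel_regularizer=tf.keras.regularizers.l2(1e-3),
)
logits = tf.keras.layers.Dense(10, kernel_initializer=xavier)

# A smaller learning rate also helps keep the deeper network's gradients stable.
optimizer = tf.keras.optimizers.SGD(learning_rate=0.05)
```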