I can't get TensorFlow ReLU activations (neither tf.nn.relu nor tf.nn.relu6) working without NaN values for activations and weights killing my training.
If you use a softmax classifier at the top of your network, try to make the initial weights of the layer just below the softmax very small (e.g. std=1e-4). This makes the initial distribution of outputs of the network very soft (high temperature), and helps ensure that the first few steps of your optimization are not too large and numerically unstable.
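If it helps, here is a minimal sketch of that initialization in TensorFlow 2.x / Keras (the network shape of 784 inputs, 128 hidden units, and 10 classes is a placeholder, not from your setup). It also feeds raw logits into a combined softmax cross-entropy loss, which is another common guard against NaNs:

```python
import tensorflow as tf

# A minimal sketch (assuming TensorFlow 2.x with Keras); the layer
# sizes (784 inputs, 128 hidden units, 10 classes) are placeholders.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(128, activation="relu"),
    # Layer just below the softmax: draw its initial weights from a
    # normal distribution with a very small standard deviation (1e-4),
    # so initial logits are near zero and the softmax is near uniform.
    tf.keras.layers.Dense(
        10,
        kernel_initializer=tf.keras.initializers.RandomNormal(stddev=1e-4),
        bias_initializer="zeros",
    ),
])

# Leaving the last layer linear and setting from_logits=True lets
# TensorFlow compute softmax + cross-entropy in one numerically
# stable op, which also helps avoid NaN losses.
model.compile(
    optimizer="sgd",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
```

Note the last layer has no activation: applying an explicit softmax and then taking a log yourself is a frequent source of NaNs, since log(0) is -inf.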