TensorFlow's ReluGrad claims input is not finite


The error is due to 0 * log(0) producing a NaN.

This can be avoided by adding a small epsilon so the argument of the log never reaches zero:

cross_entropy = -tf.reduce_sum(y * tf.log(yconv + 1e-9))
user1111929

Since I had another topic on this issue [ Tensorflow NaN bug? ] I didn't keep this one updated, but the solution has been there for a while and has since been echoed by posters here. The problem is indeed 0 * log(0) resulting in a NaN.

One option is the line Muaaz suggests above, or the one I wrote in the linked topic. But TensorFlow has this routine built in: tf.nn.softmax_cross_entropy_with_logits. It is more efficient and numerically stable, so, as a commenter on the linked topic pointed out, it should be preferred over both of our hand-rolled fixes whenever possible.
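
A minimal sketch of that approach (the names y_logits and y_ are my assumptions for the raw pre-softmax output of the last layer and the one-hot labels; they are not from the original post):

# Fuses softmax and cross-entropy in one numerically stable op.
# Feed it the raw logits, NOT the output of tf.nn.softmax.
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_logits))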

I have experienced this input is not finite error before (though not with tf.nn.relu). In my case the problem was that the elements of my tensor variable reached very large values, which marked them as infinite and hence produced the message input is not finite.

I would suggest adding some debugging output around tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1) at every n-th iteration to track exactly when it reaches infinity.
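
For instance (a rough sketch; train_step and conv2d are assumed from the MNIST tutorial this thread appears to follow, and the feed_dict names match the snippet further down):

h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_max = tf.reduce_max(tf.abs(h_conv1))  # largest activation magnitude

for i in range(10000):
    if i % 100 == 0:  # every n-th iteration
        m = h_max.eval(feed_dict={x: batch_xs, y_: batch_ys, keep_prob: 1.0})
        print("step %d, max |h_conv1| = %g" % (i, m))
    train_step.run(feed_dict={x: batch_xs, y_: batch_ys, keep_prob: 0.5})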

This looks consistent with your comment:

If I modify the value to 1e-3, the crash occurs significantly earlier. However, changing it to 1e-5 prevents the algorithm from converging

I can't comment because of reputation, but Muaaz has the answer. The error can be reproduced by training a system that reaches 0 error, resulting in log(0). His solution prevents this. Alternatively, catch the error and move on:

...your other code...

try:
    for i in range(10000):
        train_accuracy = accuracy.eval(feed_dict={
            x: batch_xs, y_: batch_ys, keep_prob: 1.0})
except:
    print("training interrupted. Hopefully deliberately")
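
Note that a bare except like this swallows every failure, including a Ctrl-C. If you only want to survive this particular crash, catching tf.errors.InvalidArgumentError (the exception class TensorFlow raises when a runtime check like this one fails) is a narrower choice.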