what's the difference between softmax_cross_entropy_with_logits and losses.log_loss?


Those methods are not so different in theory; however, they have a number of differences in implementation:

1) tf.nn.softmax_cross_entropy_with_logits is designed for labels where the classes are mutually exclusive (a single correct class per example), while tf.losses.log_loss can also be used for multi-label classification (more than one correct class per example). tf.nn.softmax_cross_entropy_with_logits won't throw an error if you feed it labels with more than one positive class, but your gradients won't be calculated correctly and training will most probably fail.

From official documentation:

NOTE: While the classes are mutually exclusive, their probabilities need not be. All that is required is that each row of labels is a valid probability distribution. If they are not, the computation of the gradient will be incorrect.

2) tf.nn.softmax_cross_entropy_with_logits (as the name suggests) first applies the softmax function to your predictions, while log_loss does not.

3) tf.losses.log_loss has somewhat broader functionality: you can weight each element of the loss, and you can specify an epsilon, used in the calculation, to avoid log(0).

4) Finally, tf.nn.softmax_cross_entropy_with_logits returns a loss for every entry in the batch, while tf.losses.log_loss returns a reduced value (by default, the sum divided by the number of non-zero weights), which can be used directly in an optimizer. A minimal comparison of the two ops is sketched below.
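Here is a minimal sketch of these differences, assuming TensorFlow 1.x (where both ops live); the tensor values are made up purely for illustration:

```python
import tensorflow as tf

labels = tf.constant([[1.0, 0.0],
                      [0.0, 1.0]])   # one-hot targets
logits = tf.constant([[2.0, 1.0],
                      [0.5, 1.5]])   # raw, unnormalized scores

# Applies softmax internally; returns one loss value per example (shape [2]).
ce_per_example = tf.nn.softmax_cross_entropy_with_logits(labels=labels,
                                                         logits=logits)
ce_reduced = tf.reduce_mean(ce_per_example)  # reduce it yourself before optimizing

# Expects probabilities, not logits; supports per-element weights and an
# epsilon guard against log(0); returns an already-reduced scalar.
probs = tf.nn.softmax(logits)
ll_reduced = tf.losses.log_loss(labels=labels,
                                predictions=probs,
                                weights=1.0,
                                epsilon=1e-7)

with tf.Session() as sess:
    print(sess.run([ce_per_example, ce_reduced, ll_reduced]))
```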

UPD: Another difference is the way they calculate the loss. Logarithmic loss takes the negative classes into account (those where you have 0s in the label vector). In short, cross-entropy loss forces the network to produce the maximum output for the correct class and does not care about the negative classes. Logarithmic loss does both at the same time: it forces the correct classes to have larger values and the negative classes to have smaller ones. In mathematical form it looks as follows:

Cross-entropy loss:

$L_{CE} = -\sum_i y_i \log(p_i)$

Logarithmic loss:

$L_{log} = -\sum_i \left( y_i \log(p_i) + (1 - y_i) \log(1 - p_i) \right)$

where $i$ is the corresponding class, $y_i$ its label, and $p_i$ the predicted probability.

So for example, if you have labels=[1,0] and predictions_with_softmax = [0.7,0.3], then:

1) Cross-Entropy Loss: -(1 * log(0.7) + 0 * log(0.3)) = 0.3567

2) Logarithmic Loss: -(1 * log(0.7) + (1 - 1) * log(1 - 0.7) + 0 * log(0.3) + (1 - 0) * log(1 - 0.3)) = -(log(0.7) + log(0.7)) = 0.7133

And with the default reduction of tf.losses.log_loss, the raw logarithmic loss is divided by the number of non-zero weights (here it's 2). So finally: tf.losses.log_loss = 0.7133 / 2 = 0.3566

In this case the two outputs happen to be equal, but that is not always the case.
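To make the arithmetic above easy to check, here is a plain-Python version of both formulas (natural logarithms, which both TensorFlow ops also use):

```python
import math

labels = [1, 0]
preds = [0.7, 0.3]

# Cross-entropy: only the positive class contributes.
ce = -sum(y * math.log(p) for y, p in zip(labels, preds))

# Logarithmic loss: positive and negative classes both contribute; dividing by
# the number of non-zero weights (2 here) mirrors tf.losses.log_loss's default
# reduction.
ll = -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
          for y, p in zip(labels, preds))

print(ce, ll, ll / 2)   # ~0.3567, ~0.7133, ~0.3567
```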

There are basically two differences between them:

1) The labels used in tf.nn.softmax_cross_entropy_with_logits are the one-hot version of the labels used in tf.losses.log_loss.

2) tf.nn.softmax_cross_entropy_with_logits calculates the softmax of the logits internally before computing the cross-entropy.

Notice that tf.losses.log_loss also accepts one-hot encoded labels. However, tf.nn.softmax_cross_entropy_with_logits only accepts one-hot encoded labels; see the sketch below.
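As a small illustration (again assuming TensorFlow 1.x, with made-up values), both ops are fed the same one-hot labels; if you have integer class ids instead, tf.nn.sparse_softmax_cross_entropy_with_logits is the variant that accepts them:

```python
import tensorflow as tf

one_hot = tf.constant([[0.0, 1.0, 0.0]])     # class 1 as a one-hot row
logits = tf.constant([[0.1, 2.0, -1.0]])
probs = tf.nn.softmax(logits)

# Both accept one-hot labels; log_loss additionally needs probabilities.
ce = tf.nn.softmax_cross_entropy_with_logits(labels=one_hot, logits=logits)
ll = tf.losses.log_loss(labels=one_hot, predictions=probs)

# Integer labels go to the sparse variant instead:
ce_sparse = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=[1],
                                                           logits=logits)

with tf.Session() as sess:
    print(sess.run([ce, ll, ce_sparse]))
```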

Hope this helps.
