Question
What's the primary difference between tf.nn.softmax_cross_entropy_with_logits
and tf.losses.log_loss?
Both methods accept one-hot labels and logits to calculate the cross-entropy loss for classification tasks.
Answer 1:
These methods are not so different in theory; however, they have a number of differences in implementation:
1) tf.nn.softmax_cross_entropy_with_logits
is designed for mutually exclusive (single-label) classes, while tf.losses.log_loss
can also be used for multi-label classification. tf.nn.softmax_cross_entropy_with_logits
won't throw an error if you feed it multi-label targets, but your gradients won't be calculated correctly and training will most probably fail.
From the official documentation:
NOTE: While the classes are mutually exclusive, their probabilities need not be. All that is required is that each row of labels is a valid probability distribution. If they are not, the computation of the gradient will be incorrect.
2) tf.nn.softmax_cross_entropy_with_logits
calculates (as the name suggests) the softmax function on top of your logits first, while log_loss does not do this and expects probabilities as input.
3) tf.losses.log_loss
has somewhat wider functionality, in the sense that you can weight each element of the loss, and you can specify epsilon,
which is used in the calculation to avoid log(0).
4) Finally, tf.nn.softmax_cross_entropy_with_logits
returns a loss for every entry in the batch, while tf.losses.log_loss
returns a reduced value (by default, the sum divided by the number of non-zero weights), which can be used directly in an optimizer. See the sketch after this list.
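For concreteness, here is a minimal sketch of points 2)-4), assuming a TF 1.x-style graph API (tf.compat.v1 on TensorFlow 2); the tensor values are purely illustrative:

```python
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()

labels = tf.constant([[0.0, 1.0], [1.0, 0.0]])   # one-hot labels, batch of 2
logits = tf.constant([[0.5, 2.0], [1.5, -0.5]])  # raw scores, no softmax applied
probs  = tf.nn.softmax(logits)                   # log_loss expects probabilities

# Per-example loss vector of shape [2]; softmax is applied internally.
per_example = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)

# Already reduced to a scalar; also exposes `weights` and `epsilon`.
reduced = tf.losses.log_loss(labels=labels, predictions=probs,
                             weights=1.0, epsilon=1e-7)

with tf.Session() as sess:
    print(sess.run(per_example).shape)   # (2,)  -- one loss per batch entry
    print(sess.run(reduced).shape)       # ()    -- a single scalar
```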
UPD: Another difference is the way they calculate the loss. Logarithmic loss takes into account negative classes (those where you have 0s in the label vector). In short, cross-entropy loss forces the network to produce the maximum output for the correct class and does not care about negative classes. Logarithmic loss does both at the same time: it forces correct classes to have larger values and negative classes smaller ones. In mathematical terms it looks as follows:
Cross-entropy loss: L = -Σ_i y_i * log(p_i)
Logarithmic loss: L = -Σ_i [y_i * log(p_i) + (1 - y_i) * log(1 - p_i)]
where i is the corresponding class, y_i is the label for class i, and p_i is the predicted probability for class i.
So for example, if you have labels = [1, 0] and predictions_with_softmax = [0.7, 0.3], then:
1) Cross-entropy loss: -(1 * log(0.7) + 0 * log(0.3)) = 0.3567
2) Logarithmic loss: -(1 * log(0.7) + (1 - 1) * log(1 - 0.7) + 0 * log(0.3) + (1 - 0) * log(1 - 0.3)) = -(log(0.7) + log(0.7)) = 0.7133
And with the default reduction, tf.losses.log_loss
divides this summed loss by the number of non-zero weight elements (here it's 2). So finally: tf.losses.log_loss = 0.7133 / 2 = 0.3566.
In this case we get equal outputs, but that is not always the case.
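These numbers can be reproduced directly; a minimal sketch, again assuming a TF 1.x-style graph API (tf.compat.v1 on TensorFlow 2), with logits chosen so that their softmax is exactly [0.7, 0.3]:

```python
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()

labels = tf.constant([[1.0, 0.0]])    # one-hot label for a single example
probs  = tf.constant([[0.7, 0.3]])    # predictions after softmax
logits = tf.math.log(probs)           # softmax(log(p)) == p when p sums to 1

# Per-example cross-entropy; softmax is applied to `logits` internally.
xent = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)

# Log loss over probabilities; the default reduction divides the summed loss
# by the number of non-zero weights (2 elements here).
log_loss = tf.losses.log_loss(labels=labels, predictions=probs)

with tf.Session() as sess:
    print(sess.run(xent))      # ~[0.3567]
    print(sess.run(log_loss))  # ~0.3567
```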
Answer 2:
There are basically two differences between them:
1) The labels used in tf.nn.softmax_cross_entropy_with_logits
are the one-hot version of the labels used in tf.losses.log_loss.
2) tf.nn.softmax_cross_entropy_with_logits
calculates the softmax of the logits internally before calculating the cross-entropy.
Notice that tf.losses.log_loss
also accepts one-hot encoded labels. However, tf.nn.softmax_cross_entropy_with_logits
only accepts labels with one-hot encoding.
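As a minimal sketch of point 2) (again assuming a TF 1.x-style graph API, i.e. tf.compat.v1 on TensorFlow 2), the fused op gives the same result as applying softmax yourself and then taking the cross-entropy against the one-hot labels, just in a more numerically stable way:

```python
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()

logits = tf.constant([[2.0, 1.0, 0.1]])
labels = tf.constant([[1.0, 0.0, 0.0]])   # one-hot encoded label

# Fused op: softmax + cross-entropy in one numerically stable step.
fused = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)

# Manual equivalent: softmax first, then cross-entropy against the labels.
probs  = tf.nn.softmax(logits)
manual = -tf.reduce_sum(labels * tf.math.log(probs), axis=1)

with tf.Session() as sess:
    print(sess.run([fused, manual]))   # both ~[0.417]
```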
Hope this helps.
Source: https://stackoverflow.com/questions/47245113/whats-the-difference-between-softmax-cross-entropy-with-logits-and-losses-log-l