Question
What's the primary difference between tf.nn.softmax_cross_entropy_with_logits
and tf.losses.log_loss?
Both methods accept one-hot labels and logits to calculate the cross-entropy loss for classification tasks.
Answer 1:
These methods are not so different in theory; however, they have a number of differences in implementation:
1) tf.nn.softmax_cross_entropy_with_logits
is designed for mutually exclusive (single-label) classes, while tf.losses.log_loss
can also be used for multi-label classification. tf.nn.softmax_cross_entropy_with_logits
won't throw an error if you feed it multi-label targets, but your gradients won't be calculated correctly and training will most probably fail.
From the official documentation:
NOTE: While the classes are mutually exclusive, their probabilities need not be. All that is required is that each row of labels is a valid probability distribution. If they are not, the computation of the gradient will be incorrect.
2) tf.nn.softmax_cross_entropy_with_logits
calculates (as the name suggests) the softmax function on top of your logits first, while log_loss does not do this and expects probabilities as input.
3) tf.losses.log_loss
has somewhat wider functionality, in the sense that you can weight each element of the loss, and you can specify epsilon,
which is used in the calculation to avoid log(0).
4) Finally, tf.nn.softmax_cross_entropy_with_logits
returns a loss for every entry in the batch, while tf.losses.log_loss
returns a reduced value (by default, the sum divided by the number of non-zero weights), which can be used directly in an optimizer. See the sketch after this list.
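For concreteness, here is a minimal sketch of points 2)-4), assuming a TF 1.x-style graph API (tf.compat.v1 on TensorFlow 2); the tensor values are purely illustrative:

```python
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()

labels = tf.constant([[0.0, 1.0], [1.0, 0.0]])   # one-hot labels, batch of 2
logits = tf.constant([[0.5, 2.0], [1.5, -0.5]])  # raw scores, no softmax applied
probs  = tf.nn.softmax(logits)                   # log_loss expects probabilities

# Per-example loss vector of shape [2]; softmax is applied internally.
per_example = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)

# Already reduced to a scalar; also exposes `weights` and `epsilon`.
reduced = tf.losses.log_loss(labels=labels, predictions=probs,
                             weights=1.0, epsilon=1e-7)

with tf.Session() as sess:
    print(sess.run(per_example).shape)   # (2,)  -- one loss per batch entry
    print(sess.run(reduced).shape)       # ()    -- a single scalar
```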
UPD: Another difference is the way they calculate the loss. Logarithmic loss takes into account negative classes (those where you have 0s in the label vector). In short, cross-entropy loss forces the network to produce the maximum output for the correct class and does not care about negative classes. Logarithmic loss does both at the same time: it forces correct classes to have larger values and negative classes smaller ones. In mathematical terms it looks as follows:
Cross-entropy loss: L = -Σ_i y_i * log(p_i)
Logarithmic loss: L = -Σ_i [y_i * log(p_i) + (1 - y_i) * log(1 - p_i)]
where i is the corresponding class, y_i is the label for class i, and p_i is the predicted probability for class i.
So for example, if you have labels = [1, 0] and predictions_with_softmax = [0.7, 0.3], then:
1) Cross-entropy loss: -(1 * log(0.7) + 0 * log(0.3)) = 0.3567
2) Logarithmic loss: -(1 * log(0.7) + (1 - 1) * log(1 - 0.7) + 0 * log(0.3) + (1 - 0) * log(1 - 0.3)) = -(log(0.7) + log(0.7)) = 0.7133
And with the default reduction, tf.losses.log_loss
divides this summed loss by the number of non-zero weight elements (here it's 2). So finally: tf.losses.log_loss = 0.7133 / 2 = 0.3566.
In this case we get equal outputs, but that is not always the case.
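These numbers can be reproduced directly; a minimal sketch, again assuming a TF 1.x-style graph API (tf.compat.v1 on TensorFlow 2), with logits chosen so that their softmax is exactly [0.7, 0.3]:

```python
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()

labels = tf.constant([[1.0, 0.0]])    # one-hot label for a single example
probs  = tf.constant([[0.7, 0.3]])    # predictions after softmax
logits = tf.math.log(probs)           # softmax(log(p)) == p when p sums to 1

# Per-example cross-entropy; softmax is applied to `logits` internally.
xent = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)

# Log loss over probabilities; the default reduction divides the summed loss
# by the number of non-zero weights (2 elements here).
log_loss = tf.losses.log_loss(labels=labels, predictions=probs)

with tf.Session() as sess:
    print(sess.run(xent))      # ~[0.3567]
    print(sess.run(log_loss))  # ~0.3567
```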
Answer 2:
There are basically two differences between them:
1) The labels used in tf.nn.softmax_cross_entropy_with_logits
are the one-hot version of the labels used in tf.losses.log_loss.
2) tf.nn.softmax_cross_entropy_with_logits
calculates the softmax of the logits internally before calculating the cross-entropy.
Notice that tf.losses.log_loss
also accepts one-hot encoded labels. However, tf.nn.softmax_cross_entropy_with_logits
only accepts labels with one-hot encoding.
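As a minimal sketch of point 2) (again assuming a TF 1.x-style graph API, i.e. tf.compat.v1 on TensorFlow 2), the fused op gives the same result as applying softmax yourself and then taking the cross-entropy against the one-hot labels, just in a more numerically stable way:

```python
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()

logits = tf.constant([[2.0, 1.0, 0.1]])
labels = tf.constant([[1.0, 0.0, 0.0]])   # one-hot encoded label

# Fused op: softmax + cross-entropy in one numerically stable step.
fused = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)

# Manual equivalent: softmax first, then cross-entropy against the labels.
probs  = tf.nn.softmax(logits)
manual = -tf.reduce_sum(labels * tf.math.log(probs), axis=1)

with tf.Session() as sess:
    print(sess.run([fused, manual]))   # both ~[0.417]
```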
Hope this helps.
Source: https://stackoverflow.com/questions/47245113/whats-the-difference-between-softmax-cross-entropy-with-logits-and-losses-log-l