from_logits=True and from_logits=False get different training results for tf.losses.CategoricalCrossentropy for UNet

Submitted by ☆樱花仙子☆ on 2019-12-05 07:46:19

Pushing the "softmax" activation into the cross-entropy loss layer significantly simplifies the loss computation and makes it more numerically stable.
It might be the case that in your example the numerical issues are significant enough to render the training process ineffective for the from_logits=False option.

You can find a derivation of the cross-entropy loss (a special case of "info gain" loss) in this post. That derivation illustrates the numerical issues that are averted when softmax is combined with the cross-entropy loss.
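To make the difference concrete, here is a small sketch (not from the original answer; the values are illustrative) that scores the same one-hot target once from raw logits and once from an explicit softmax output:

```python
import tensorflow as tf

# Illustrative values: with from_logits=True the loss applies softmax internally
# using a numerically stable log-sum-exp; with from_logits=False it works on the
# already-saturated softmax output, which can lose precision for large logits.
y_true = tf.constant([[0., 0., 1., 0.]])
logits = tf.constant([[2.0, 1.0, 30.0, -5.0]])   # deliberately large to stress softmax
probs = tf.nn.softmax(logits)

loss_from_logits = tf.keras.losses.CategoricalCrossentropy(from_logits=True)
loss_from_probs = tf.keras.losses.CategoricalCrossentropy(from_logits=False)

print(loss_from_logits(y_true, logits).numpy())  # computed from the logits directly
print(loss_from_probs(y_true, probs).numpy())    # computed from the softmax output
```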

I guess the problem comes from the softmax activation function. Looking at the docs, I found that softmax is applied to the last axis by default. Can you look at model.summary() and check whether that is what you want?
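For illustration, here is a minimal sketch of such a model head (the layer sizes and names are made up here, not taken from the question): the final Conv2D produces one channel per class, the softmax axis is made explicit, and model.summary() lets you confirm the output shape ends in n_classes.

```python
from tensorflow import keras

# Hypothetical UNet-style head: one output channel per class,
# softmax over the last (channel) axis, which is also the default.
n_classes = 4
inputs = keras.Input(shape=(128, 128, 64))
logits = keras.layers.Conv2D(n_classes, kernel_size=1)(inputs)  # (None, 128, 128, n_classes)
outputs = keras.layers.Softmax(axis=-1)(logits)                 # explicit about the axis
model = keras.Model(inputs, outputs)
model.summary()                                                 # output shape should end in n_classes
```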

For softmax to work properly, you must make sure of the following (a quick sanity-check sketch follows the list):

  • You are using 'channels_last', which is the Keras default channel configuration.

    • This means the shapes in the model will be like (None, height, width, channels)
    • This seems to be your case, since you put n_classes in the last axis. It is still odd, though: with Conv2D layers your output Y should have shape (1, height, width, n_classes), not the unusual shape you are currently using.
  • Your Y contains only zeros and ones (not 0 and 255, as often happens with image masks)

    • Check that Y.max() == 1 and Y.min() == 0
    • You may need to have Y = Y / 255.
  • Only one class is correct at each pixel (your data does not have more than one channel with value = 1 at the same location).

    • Check that (Y.sum(axis=-1) == 1).all() is True
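
As mentioned above, here is a small sketch of those checks (not from the original answer), assuming Y is a NumPy array of one-hot masks with shape (batch, height, width, n_classes); the file path is hypothetical.

```python
import numpy as np

# Hypothetical path; replace with however you load your label masks.
Y = np.load("labels.npy")

if Y.max() > 1:                  # masks saved as 0/255 images need rescaling
    Y = Y / 255.0

assert Y.min() == 0 and Y.max() == 1, "Y should contain only zeros and ones"
assert (Y.sum(axis=-1) == 1).all(), "exactly one class per pixel along the last axis"
```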