How to interpret the discriminator's loss and the generator's loss in Generative Adversarial Nets?

谁都会走 提交于 2019-12-03 02:50:30

Unfortunately, like you've said for GANs the losses are very non-intuitive. Mostly it happens down to the fact that generator and discriminator are competing against each other, hence improvement on the one means the higher loss on the other, until this other learns better on the received loss, which screws up its competitor, etc.

Now one thing that should happen often enough (depending on your data and initialisation) is that both discriminator and generator losses are converging to some permanent numbers, like this:

(it's ok for loss to bounce around a bit - it's just the evidence of the model trying to improve itself)

This loss convergence would normally signify that the GAN model found some optimum, where it can't improve more, which also should mean that it has learned well enough. (Also note, that the numbers themselves usually aren't very informative.)

Here are a few side notes, that I hope would be of help:

  • if loss haven't converged very well, it doesn't necessarily mean that the model hasn't learned anything - check the generated examples, sometimes they come out good enough. Alternatively, can try changing learning rate and other parameters.
  • if the model converged well, still check the generated examples - sometimes the generator finds one/few examples that discriminator can't distinguish from the genuine data. The trouble is it always gives out these few, not creating anything new, this is called mode collapse. Usually introducing some diversity to your data helps.
  • as vanilla GANs are rather unstable, I'd suggest to use some version of the DCGAN models, as they contain some features like convolutional layers and batch normalisation, that are supposed to help with the stability of the convergence. (the picture above is a result of the DCGAN rather than vanilla GAN)
  • This is some common sense but still: like with most neural net structures tweaking the model, i.e. changing its parameters or/and architecture to fit your certain needs/data can improve the model or screw it.
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!