I am applying deep learning to a voice conversion task (i.e. making the source speaker sound like the target speaker). Before touching GANs, I had a baseline (i.e. only generato