I\'m currently implementing a CVAE with the mnist dataset, but i didn\'t get what is the difference for the model in concatenate the image with the labels as one hot encodin