I am trying to build an image-to-image regressor using a convolutional autoencoder. The inputs are 3D volumes of size (M x M x M) with 3 RGB channels (input shape M x M x M