I have built a fully convolutional network, and I feed its subnetwork A with MFCC coefficients. The wav files from which the MFCCs are calculated have variable duration, so every wav