As part of a school project, I am trying to train a CNN on audio from football games to predict highlights. The data consists of MFCC spectrograms (https://librosa.org/doc/main/