I am working on a final project at my university: pitch estimation from a song using a CNN.
The input to the CNN is a spectrogram of the song, generated by plt.specgram(), with size 3
plt.specgram()
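
For reference, here is a minimal sketch of how a spectrogram array could be pulled out of plt.specgram() and shaped for a CNN. The synthetic 440 Hz tone, the sample rate, the NFFT/noverlap values, and the names `log_spec` and `cnn_input` are all assumptions for illustration, not the actual settings of my project.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so no window is opened
import matplotlib.pyplot as plt

# Assumed stand-in for a song: one second of a 440 Hz sine tone.
fs = 8000  # assumed sample rate in Hz
t = np.arange(0, 1.0, 1.0 / fs)
signal = np.sin(2 * np.pi * 440.0 * t)

# plt.specgram returns (spectrum, freqs, times, image).
# spectrum has shape (NFFT // 2 + 1, number_of_time_segments).
spectrum, freqs, times, im = plt.specgram(signal, NFFT=256, Fs=fs, noverlap=128)
plt.close()

# Convert power to decibels; the small offset avoids log(0).
log_spec = 10 * np.log10(spectrum + 1e-10)

# Add batch and channel axes: (batch, freq_bins, time_steps, channels).
cnn_input = log_spec[np.newaxis, ..., np.newaxis]

# Sanity check: the strongest frequency bin should sit near 440 Hz.
peak_freq = freqs[spectrum.mean(axis=1).argmax()]
print(cnn_input.shape, peak_freq)
```

Using the returned array rather than the rendered image keeps the frequency and time axes (`freqs`, `times`) available, which helps when mapping a prediction back to a pitch in Hz.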