How do I apply a binary mask and STFT to produce an audio file?
Question: So here's the idea: you can generate a spectrogram from an audio file using the short-time Fourier transform (STFT). Some people then generate something called a "binary mask" to produce different audio (i.e., with background noise removed, etc.) from the inverse STFT. Here's what I understand: the STFT is a simple equation that is applied to the audio file, and it generates information that can easily be displayed as a spectrogram. By taking the inverse of the STFT matrix, and multiplying it by a