Why do I need to apply a window function to samples when building a power spectrum of an audio signal?

青春惊慌失措 2020-12-04 20:21

I have several times come across the following guidelines for computing the power spectrum of an audio signal:

  • collect N samples, where N is a power of 2
4 Answers
  • 2020-12-04 20:35

    As @cyco130 says, your samples are already windowed by a rectangular function. Since the discrete Fourier transform implicitly treats the block of samples as one period of a periodic signal, any discontinuity between the last sample and the repeated first sample will cause artefacts in the spectrum (e.g. "smearing" of the peaks). This is known as spectral leakage. To reduce this effect, we apply a tapered window function such as a Hann window, which smooths out any such discontinuity and thereby reduces artefacts in the spectrum.
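
    As a rough illustration of the leakage described above, the sketch below compares a rectangular (i.e. no) window with a Hann window on a single tone that does not complete a whole number of cycles in the frame. The sample rate, frame length, and tone frequency are arbitrary assumptions chosen for the demo, not values from the question.

```python
import numpy as np

fs = 8000          # sample rate in Hz (assumed for this sketch)
N = 1024           # number of samples, a power of 2
n = np.arange(N)

# 10.5 cycles per frame: the tone falls halfway between FFT bins 10 and 11,
# so the periodic extension of the frame is discontinuous at the boundary.
f0 = 10.5 * fs / N
x = np.sin(2 * np.pi * f0 * n / fs)

def power_db(frame):
    """Power spectrum in dB relative to the strongest bin."""
    p = np.abs(np.fft.rfft(frame)) ** 2
    return 10 * np.log10(p / p.max() + 1e-30)

rect_db = power_db(x)                  # no window = rectangular window
hann_db = power_db(x * np.hanning(N))  # tapered frame

# Leakage measured well away from the tone (bins 100 and up).
rect_leak = rect_db[100:].max()
hann_leak = hann_db[100:].max()
print(f"rectangular: {rect_leak:.1f} dB, Hann: {hann_leak:.1f} dB")
```

    With these settings the Hann figure should come out far lower than the rectangular one, which is the reduction in far-from-peak leakage that the answer describes.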

  • 2020-12-04 20:47

    Note that a non-rectangular window has both benefits and costs. Multiplying by a window in the time domain is equivalent to convolving the window's transform with the signal's spectrum. A typical window, such as a von Hann window, will reduce the "leakage" from any spectral content that is not periodic in the window, which results in a less noisy-looking spectrum; in return, the convolution will "blur" any exactly or nearly periodic spectral peaks across a few adjacent bins. That is, all the spectral peaks become rounder, which may reduce frequency estimation accuracy. If you know a priori that there is no non-periodic content (e.g. data from some rotationally synchronous sampling system), a non-rectangular window could actually make the FFT look worse.
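
    The cost side of that trade-off can be seen with a tone that is exactly periodic in the frame. This is a minimal sketch under assumed values (frame length 1024, 64 cycles per frame); it uses the periodic form of the Hann window so that the spreading is confined to exactly three bins.

```python
import numpy as np

N = 1024
n = np.arange(N)
k0 = 64                                  # exactly 64 cycles per frame
x = np.cos(2 * np.pi * k0 * n / N)

# Periodic Hann window (assumed here so that the three-bin pattern is exact).
hann = 0.5 - 0.5 * np.cos(2 * np.pi * n / N)

rect_mag = np.abs(np.fft.rfft(x))
hann_mag = np.abs(np.fft.rfft(x * hann))

def occupied_bins(mag, rel_threshold=1e-6):
    """Indices of bins whose magnitude exceeds a fraction of the peak."""
    return np.flatnonzero(mag > rel_threshold * mag.max())

print("rectangular:", occupied_bins(rect_mag))
print("Hann:       ", occupied_bins(hann_mag))
```

    Without a window the tone occupies bin 64 alone; with the Hann window it spreads into bins 63 to 65, which is the "blur" across adjacent bins mentioned above.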

    Applying a non-rectangular window is also an informationally lossy process. A significant amount of spectral information near the edges of the window will be thrown away, assuming finite-precision arithmetic. So non-rectangular windows are best used with overlapping window processing, and/or when one can assume that the spectrum of interest is either stationary across the entire window width or centered in the window.
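
    One way to see why overlap helps: with a periodic Hann window and a hop of half a frame, the shifted windows add up to a constant, so samples that fall near the edge of one frame sit near the centre of the next. The frame length and the 50% overlap below are assumptions for this sketch, not something the answer prescribes.

```python
import numpy as np

N = 1024                 # frame length (assumed)
hop = N // 2             # 50% overlap (assumed)
n = np.arange(N)
hann = 0.5 - 0.5 * np.cos(2 * np.pi * n / N)   # periodic Hann window

# Accumulate the weight each sample receives from all frames that cover it.
total = 8 * N
weight = np.zeros(total)
for start in range(0, total - N + 1, hop):
    weight[start:start + N] += hann

interior = weight[hop:-hop]   # skip the first and last half frame
print(interior.min(), interior.max())
```

    Away from the two ends of the signal, every sample receives the same total weight, so no part of the signal is systematically discarded.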

  • 2020-12-04 20:56

    Most real world audio signals are non-periodic, meaning that real audio signals do not generally repeat exactly, over any given time span.

    However, the math of the Fourier transform assumes that the signal being Fourier transformed is periodic over the time span in question.

    This mismatch between the Fourier assumption of periodicity, and the real world fact that audio signals are generally non-periodic, leads to errors in the transform.

    These errors are called "spectral leakage", and generally manifest as a wrongful distribution of energy across the power spectrum of the signal.

    The plot below shows a closeup of the power spectrum of an acoustic guitar playing the A4 note. The spectrum was calculated with the FFT (Fast Fourier Transform), but the signal was not windowed prior to the FFT.

    Notice the distribution of energy above the -60 dB line, and the three distinct peaks at roughly 440 Hz, 880 Hz, and 1320 Hz. This particular distribution of energy contains "spectral leakage" errors.

    Power spectrum of guitar playing an A4 note, no window was applied

    To somewhat mitigate the "spectral leakage" errors, you can pre-multiply the signal by a window function designed specifically for that purpose, such as the Hann window function.

    The plot below shows the Hann window function in the time-domain. Notice how the tails of the function go smoothly to zero, while the center portion of the function tends smoothly towards the value 1.
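
    For reference, the shape described above follows from the usual symmetric Hann definition, w[n] = 0.5 * (1 - cos(2*pi*n / (N - 1))). The helper name `hann_window` and the length 1025 are illustrative choices for this sketch; NumPy's `np.hanning` computes the same window.

```python
import numpy as np

def hann_window(N):
    """Symmetric Hann window: w[n] = 0.5 * (1 - cos(2*pi*n / (N - 1)))."""
    n = np.arange(N)
    return 0.5 * (1.0 - np.cos(2.0 * np.pi * n / (N - 1)))

w = hann_window(1025)   # odd length, so the centre sample is the peak
print(w[0], w[512], w[-1])
```

    The two end samples are zero and the centre sample is one, matching the tails and centre of the plot.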

    Hann window function

    Now let's apply the Hann window to the guitar's audio data, and then FFT the resulting signal.

    The plot below shows a closeup of the power spectrum of the same signal (an acoustic guitar playing the A4 note), but this time the signal was pre-multiplied by the Hann window function prior to the FFT.

    Notice how the distribution of energy above the -60 dB line has changed significantly, and how the three distinct peaks have changed shape and height. This particular distribution of spectral energy contains fewer "spectral leakage" errors.

    Power spectrum of guitar playing an A4 note, Hann window was applied

    The acoustic guitar's A4 note used for this analysis was sampled at 44.1 kHz with a high-quality microphone under studio conditions; it contains essentially zero background noise, no other instruments or voices, and no post-processing.
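
    The steps in this answer (window, FFT, power in dB) can be sketched as follows. This is not the code used to produce the plots: the signal is a synthetic stand-in with partials at 440, 880, and 1320 Hz, and the frame length, amplitudes, and the function name `power_spectrum_db` are assumptions made for the example. Only the 44.1 kHz sample rate comes from the answer.

```python
import numpy as np

fs = 44100                      # sample rate from the answer above
N = 16384                       # frame length (assumed), a power of 2
t = np.arange(N) / fs

# Synthetic stand-in for the guitar note: A4 plus two weaker harmonics.
x = (1.00 * np.sin(2 * np.pi * 440 * t)
     + 0.50 * np.sin(2 * np.pi * 880 * t)
     + 0.25 * np.sin(2 * np.pi * 1320 * t))

def power_spectrum_db(frame, sample_rate):
    """Return (frequencies in Hz, power in dB relative to the strongest bin)."""
    window = np.hanning(len(frame))            # Hann window, same length as the frame
    spectrum = np.fft.rfft(frame * window)     # FFT of the windowed real signal
    power = np.abs(spectrum) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    return freqs, 10 * np.log10(power / power.max() + 1e-30)

freqs, db = power_spectrum_db(x, fs)
strongest = freqs[np.argmax(db)]
print(f"strongest bin at {strongest:.1f} Hz, bin spacing {fs / N:.2f} Hz")
```

    The strongest bin lands within one bin spacing of 440 Hz; a real recording would of course show a richer spectrum than this three-partial test signal.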

    References:

    Real audio signal data, Hann window function, plots, FFT, and spectral analysis were done here:

    Fast Fourier Transform, spectral analysis, Hann window function, audio data

  • 2020-12-04 20:58

    If you're not applying any windowing function, you're actually applying a rectangular windowing function. Different windowing functions have different characteristics, so the right choice depends on what exactly you want from the spectrum.
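
    To make "different characteristics" slightly more concrete, here is a small sketch comparing a few of NumPy's built-in windows on two common figures of merit: coherent gain (how much the amplitude of an on-bin tone is scaled) and equivalent noise bandwidth. The choice of windows, the metrics, and the helper names are assumptions for illustration; the answer itself does not name any.

```python
import numpy as np

N = 1024
windows = {
    "rectangular": np.ones(N),
    "Hann": np.hanning(N),
    "Hamming": np.hamming(N),
    "Blackman": np.blackman(N),
}

def coherent_gain(w):
    """Mean of the window: scaling applied to the amplitude of an on-bin tone."""
    return w.sum() / len(w)

def enbw_bins(w):
    """Equivalent noise bandwidth in FFT bins (1.0 for a rectangular window)."""
    return len(w) * np.sum(w ** 2) / np.sum(w) ** 2

for name, w in windows.items():
    print(f"{name:12s} gain={coherent_gain(w):.3f}  ENBW={enbw_bins(w):.3f} bins")
```

    A wider noise bandwidth generally goes with lower sidelobes, which is one face of the trade-off between leakage suppression and peak sharpness.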
