How to generate the spectrogram of a wav file in Matlab with frequency on the x-axis

生来就可爱ヽ(ⅴ<●) submitted on 2019-11-29 11:59:28

Preface

Sorry, I don’t have whichever 💸-toolbox-💰 that Mathworks puts spectrogram in, but here’s some code that I put in the public domain that does the job for me.

It’s more hands-on than spectrogram but has many of the latter’s features, as I’ll demonstrate using the handel audio clip that comes with Matlab (‘Hallelujah!’).

Setup

I won’t assume you’re familiar with git or Matlab namespaces.

  1. Create a directory called +arf somewhere in your Matlab path (e.g., ~/Documents/MATLAB or even your current code directory).
  2. Download stft.m and put it in +arf/.
  3. Also download partition.m into +arf/.

This creates an arf namespace inside which are the arf.stft and arf.partition functions (the latter is used by arf.stft).
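
If you'd rather do the setup from the Matlab prompt, here is a minimal sketch. It assumes you want +arf inside your current directory (which Matlab always searches) and that you save the two downloaded files there yourself; the checks at the end are just my suggestion for confirming the namespace is visible.

```matlab
% Sketch: create the +arf package directory and check that Matlab sees it.
% Assumption: the package lives in the current directory.
pkgDir = fullfile(pwd, '+arf');
if ~exist(pkgDir, 'dir')
    mkdir(pkgDir);
end

% After saving stft.m and partition.m into pkgDir, these should print the
% full paths of the two files. An empty result means Matlab cannot find them.
disp(which('arf.stft'))
disp(which('arf.partition'))
```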

Code

clearvars

% Load data: this is an audio clip built into Matlab.
handel = load('handel');
% To hear this audio clip, run the following:
% >> soundsc(handel.y, handel.Fs)

% STFT parameters.
% 1000 samples is roughly 1/8th of a second. A reasonable chunk size.
samplesPerChunk = 1000;
% Overlap a lot between chunks to see a smooth STFT.
overlapSamples = round(samplesPerChunk * 0.9);

% Generate STFT
[stftArr, fVec, tVec] = arf.stft(handel.y, ...
                                 samplesPerChunk, ...
                                 'noverlap', overlapSamples, ...
                                 'fs', handel.Fs);

% Plot results
figure('color', 'white');
imagesc(fVec / 1e3, tVec, 20 * log10(abs(stftArr)).');
axis xy
colorbar
xlabel('frequency (kHz)')
ylabel('time (s)')
caxis(max(caxis) - [40 0])
title('`handel` spectrogram via STFT, top 40 dB')

The code above

  1. loads the handel audio clip that’s packaged into Matlab (this is a nine-second clip from George Frideric Handel’s Messiah),
  2. defines some parameters for the STFT,
  3. evaluates the STFT with arf.stft(), and
  4. plots the STFT.

Hint: after you run the code above, or just that load line, you can listen to the original clip with soundsc(handel.y, handel.Fs).
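
If you're curious what an STFT does under the hood, here is a rough sketch using only base Matlab. To be clear, this is not the code inside arf.stft: the Hann window, the dropped partial last chunk, and all the variable names are my own assumptions for illustration.

```matlab
% Minimal STFT sketch using only base Matlab (no toolboxes).
% Illustration only; NOT the implementation inside arf.stft.
handel = load('handel');
x  = handel.y(:);
fs = handel.Fs;

chunkLen = 1000;                  % samples per chunk
overlap  = 900;                   % samples shared by consecutive chunks
hop      = chunkLen - overlap;    % step between chunk starts
nfft     = 2^nextpow2(chunkLen);  % FFT length (chunks are zero-padded to this)

% Periodic Hann window, computed by hand to avoid a toolbox dependency.
% Assumption: arf.stft may use a different window, or none.
win = 0.5 - 0.5 * cos(2 * pi * (0:chunkLen-1).' / chunkLen);

% Assumption: a partial chunk at the end of the signal is simply dropped.
starts  = 1:hop:(numel(x) - chunkLen + 1);
nChunks = numel(starts);

S = zeros(nfft/2 + 1, nChunks);
for k = 1:nChunks
    seg = x(starts(k) : starts(k) + chunkLen - 1) .* win;
    X = fft(seg, nfft);
    S(:, k) = X(1:nfft/2 + 1);    % keep only non-negative frequencies
end

f = (0:nfft/2).' * fs / nfft;         % frequency of each row, in Hz
t = (starts - 1 + chunkLen/2) / fs;   % center time of each chunk, in seconds

% Same plot layout as above: frequency on x, time on y.
figure('color', 'white');
imagesc(f / 1e3, t, 20 * log10(abs(S) + eps).');
axis xy
xlabel('frequency (kHz)')
ylabel('time (s)')
```

Each column of S is the spectrum of one windowed chunk, which is why the plot transposes it to put time on the y-axis.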

Results

In the spectrogram, you can clearly see the first two long Hallelujahs, then the two shorter ones, and then finally the last long one. Time runs along the y-axis as you wished.

The code demonstrates how to specify the chunk length (here, 1000 samples, or ≈⅛ second) and the amount of overlap (90% of the chunk length, so 900 samples of overlap). Note:

  • Larger chunk length will result in less resolution in time (but greater resolution in frequency).
  • The less overlap, the more jaggedy and less smooth the STFT appears along time (and the less computational/memory overhead you pay). The amount of overlap must be between 0 (no overlap between chunks) and chunk size - 1.

If you just play around with the chunk length, you’ll get a feel for the main knob the STFT gives you to tune. Usually one picks an overlap between 25% and 50% of the chunk size for reasonably smooth spectrograms without a huge amount of computational overhead.
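
To put rough numbers on these trade-offs, the sketch below tabulates a few chunk lengths and overlaps for the handel clip. The chunk lengths and overlap fractions are arbitrary picks of mine, fs / L is only a rule-of-thumb for per-chunk frequency resolution (it ignores the window), and the slice count assumes a partial chunk at the end is dropped, which may not match what arf.stft does.

```matlab
% Back-of-the-envelope numbers for the chunk-length and overlap trade-offs.
handel   = load('handel');
nSamples = numel(handel.y);
fs       = handel.Fs;

chunkLens    = [250 1000 4000];   % hypothetical values to compare
overlapFracs = [0.5 0.9];         % fraction of the chunk that overlaps

for L = chunkLens
    for frac = overlapFracs
        overlap = round(L * frac);
        hop     = L - overlap;    % samples between consecutive chunk starts
        % Assumption: a partial trailing chunk is dropped.
        nChunks = floor((nSamples - L) / hop) + 1;
        fprintf('chunk %4d (%.3f s), overlap %2.0f%%: ~%.1f Hz resolution, %d time slices\n', ...
                L, L / fs, 100 * frac, fs / L, nChunks);
    end
end
```

Longer chunks shrink the Hz figure but stretch each chunk over more time, and going from 50% to 90% overlap multiplies the number of time slices (and hence FFTs) roughly five-fold.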

N.B. You can increase smoothness along the frequency dimension by passing in an extra argument to arf.stft, specifically, arf.stft( ..., 'nfft', 2^nextpow2(samplesPerChunk * 8)). This explicitly sets the number of frequency bins to create (eventually, an FFT of this size is evaluated). The default is equivalent to 2^nextpow2(samplesPerChunk), so multiplying it by eight will upsample the spectrum for each chunk eight-fold.
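
Here is the arithmetic behind that eight-fold claim, as a small sketch using the handel sample rate of 8192 Hz. Keep in mind that this zero-padding only interpolates the spectrum of each chunk onto a finer grid: it makes the picture smoother but does not separate tones that the chunk length couldn't already resolve.

```matlab
% FFT sizes and bin spacing with and without the extra 'nfft' argument.
samplesPerChunk = 1000;
fs = 8192;                                        % handel.Fs

nfftDefault = 2^nextpow2(samplesPerChunk);        % 1024
nfftPadded  = 2^nextpow2(samplesPerChunk * 8);    % 8192

binSpacingDefault = fs / nfftDefault;             % 8 Hz between frequency bins
binSpacingPadded  = fs / nfftPadded;              % 1 Hz between frequency bins

fprintf('default: nfft = %d, bins every %g Hz\n', nfftDefault, binSpacingDefault);
fprintf('padded:  nfft = %d, bins every %g Hz\n', nfftPadded,  binSpacingPadded);
```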
