How to change pyplot.specgram x and y axis scaling?

Submitted by 和自甴很熟 on 2019-12-05 02:03:46

Question


I have never worked with audio signals before and know little about signal processing. Nevertheless, I need to represent an audio signal using the pyplot.specgram function from the matplotlib library. Here is how I do it.

import matplotlib.pyplot as plt
import scipy.io.wavfile as wavfile

rate, frames = wavfile.read("song.wav")
plt.specgram(frames)

The result I am getting is this nice spectrogram below:

When I look at the axes, which I suppose are frequency (y-axis) and time (x-axis), I can't get my head around the fact that frequency is scaled from 0 to 1.0 and time from 0 to 80k. What is the intuition behind it and, more importantly, how can I represent it in a human-friendly format such that frequency is 0 to 100k and time is in seconds?


Answer 1:


  • Firstly, a spectrogram is a representation of the spectral content of a signal as a function of time - this is a frequency-domain representation of the time-domain waveform (e.g. a sine wave, your file "song.wav" or some other arbitrary wave - that is, amplitude as a function of time).

  • The frequency values (y-axis, Hertz) are wholly dependent on the sampling frequency of your waveform ("song.wav") and will range from "0" to "sampling frequency / 2", with the upper limit being the "Nyquist frequency" or "folding frequency" (https://en.wikipedia.org/wiki/Aliasing#Folding). The sampling frequency is 1 / dt, with dt being the time interval between discrete samples of the waveform. The matplotlib specgram function cannot work it out from the samples alone: if it is not specified, Fs defaults to 2, which is why your frequency axis runs from 0 to 1. Pass Fs=<sampling rate> to the specgram function to define it. It will be easier for you to get your head around what is going on if you figure out and pass these variables to the specgram function yourself.

  • The time values (x-axis, seconds) depend purely on the length of your "song.wav", that is, the number of samples divided by the sampling frequency. You may notice some whitespace or padding if you use a large window length to calculate each spectral slice (think of the individual spectra, which are arranged vertically and tiled horizontally to create the spectrogram image).

  • To make the axes more intuitive in the plot, use x- and y-axis labels; you can also scale the axis values (i.e. change the units), for example with a custom tick formatter.
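
The axis ranges described in the points above come down to two small formulas. The sketch below is only arithmetic, not matplotlib itself: the helper name `specgram_axis_limits`, the sample count of 160,000 and the 44100 Hz rate are assumptions made for illustration, and `fs=2.0` mirrors the documented default of specgram's `Fs` parameter.

```python
def specgram_axis_limits(n_samples, fs=2.0):
    """Return (highest frequency, approximate duration) for a sampled signal.

    fs=2.0 mirrors the documented default of specgram's Fs parameter.
    """
    nyquist = fs / 2.0          # top of the frequency axis
    duration = n_samples / fs   # right-hand end of the time axis, roughly
    return nyquist, duration


n_samples = 160_000  # hypothetical length, chosen to match the 80k in the question

# Without Fs: frequency ends at 1.0 and "time" at 80000 (not seconds).
print(specgram_axis_limits(n_samples))

# With an assumed rate of 44100 Hz: Hertz and seconds.
print(specgram_axis_limits(n_samples, fs=44100))
```

With `Fs` left at its default the first call gives `(1.0, 80000.0)`, the kind of scaling seen in the question; passing the rate returned by `wavfile.read` turns the same data into Hertz and seconds.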

Take-home message: try to be a bit more verbose with your code. See below for my example.

    import matplotlib.pyplot as plt
    import numpy as np

    # generate a 5Hz sine wave
    fs = 50
    t = np.arange(0, 5, 1.0/fs)
    f0 = 5
    phi = np.pi/2
    A = 1
    x = A * np.sin(2 * np.pi * f0 * t +phi)

    nfft = 25

    # plot x-t, time-domain, i.e. source waveform
    plt.subplot(211)
    plt.plot(t, x)
    plt.xlabel('time (s)')
    plt.ylabel('amplitude')

    # plot power(f)-t, frequency-domain, i.e. spectrogram
    plt.subplot(212)
    # call specgram function, setting Fs (sampling frequency) 
    # and nfft (number of waveform samples, defining a time window, 
    # for which to compute the spectra)
    plt.specgram(x, Fs=fs, NFFT=nfft, noverlap=5, detrend='mean', mode='psd')
    plt.xlabel('time (s)')
    plt.ylabel('frequency (Hz)')
    plt.show()

[Figure: 5Hz_spectrogram, showing the sine wave (top) and its spectrogram (bottom)]
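
To see the numbers behind the axes without drawing anything, one option is `matplotlib.mlab.specgram`, which returns the spectrum together with the frequency and time arrays. This is a sketch under assumptions: it uses `NFFT=50` (not the 25 used above) so that 5 Hz falls exactly on a frequency bin, and the variable names are illustrative.

```python
import numpy as np
from matplotlib import mlab

fs = 50                          # sampling frequency in Hz
t = np.arange(0, 5, 1.0 / fs)    # 5 seconds of samples
x = np.sin(2 * np.pi * 5 * t)    # 5 Hz sine wave

# NFFT=50 gives 1 Hz frequency resolution at fs=50 (assumption for this sketch)
spec, freqs, times = mlab.specgram(x, NFFT=50, Fs=fs, noverlap=25)

# Average the power over time and pick the strongest frequency bin
peak_freq = freqs[np.argmax(spec.mean(axis=1))]

print("frequency axis: %g to %g Hz" % (freqs[0], freqs[-1]))
print("time axis: %g to %g s" % (times[0], times[-1]))
print("strongest frequency: %g Hz" % peak_freq)
```

The frequency array ends at fs / 2, and the time values are the centres of the analysis windows, so they stay inside the 5 seconds of the signal.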




Answer 2:


As others have pointed out, you need to specify the sample rate; otherwise you get a normalised frequency (between 0 and 1) and a time axis in sample-based units (0 to 80k) rather than seconds. Fortunately this is as simple as:

plt.specgram(frames, Fs=rate)
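
One caveat when passing `frames` directly: `scipy.io.wavfile.read` returns a 1-D array for a mono file and a 2-D array of shape (samples, channels) for a multi-channel file, whereas a spectrogram is computed from a single 1-D signal. A minimal sketch of a helper that picks one channel either way; `first_channel` is a hypothetical name, not part of scipy or matplotlib.

```python
import numpy as np

def first_channel(frames):
    """Return a 1-D signal from mono (1-D) or multi-channel (2-D) WAV data."""
    frames = np.asarray(frames)
    if frames.ndim == 1:
        return frames            # mono: already one-dimensional
    return frames[:, 0]          # multi-channel: keep the first channel

# Stand-in arrays instead of a real file, just to show the shapes
mono = np.zeros(8)
stereo = np.zeros((8, 2))
print(first_channel(mono).shape, first_channel(stereo).shape)
```

It could then be used as `plt.specgram(first_channel(frames), Fs=rate)`.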

To expand on Nukolas's answer, and combining my answers to "Changing plot scale by a factor in matplotlib" and "matplotlib intelligent axis labels for timedelta", we can not only get kHz on the frequency axis, but also minutes and seconds on the time axis.

import matplotlib.pyplot as plt
import scipy.io.wavfile as wavfile

cmap = plt.get_cmap('viridis') # this may fail on older versions of matplotlib
vmin = -40  # hide anything below -40 dB
cmap.set_under(color='k', alpha=None)

rate, frames = wavfile.read("song.wav")
fig, ax = plt.subplots()
pxx, freq, t, cax = ax.specgram(frames[:, 0], # first channel (assumes a multi-channel file)
                                Fs=rate,      # to get frequency axis in Hz
                                cmap=cmap, vmin=vmin)
cbar = fig.colorbar(cax)
cbar.set_label('Intensity dB')
ax.axis("tight")

# Prettify
import matplotlib.ticker  # import the submodule explicitly for FuncFormatter
import datetime

ax.set_xlabel('time h:mm:ss')
ax.set_ylabel('frequency kHz')

scale = 1e3                     # kHz
ticks = matplotlib.ticker.FuncFormatter(lambda x, pos: '{0:g}'.format(x/scale))
ax.yaxis.set_major_formatter(ticks)

def timeTicks(x, pos):
    d = datetime.timedelta(seconds=x)
    return str(d)
formatter = matplotlib.ticker.FuncFormatter(timeTicks)
ax.xaxis.set_major_formatter(formatter)
plt.show()
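
The two tick formatters above are plain functions, so their output can be checked without a figure. The sketch below restates them under hypothetical names (`khz_label`, `hms_label`) and, as a deliberate change from the code above, rounds to whole seconds so that fractional tick positions do not produce labels with microseconds.

```python
import datetime

def khz_label(hz, pos=None):
    """Format a frequency in Hz as a kHz tick label."""
    return '{0:g}'.format(hz / 1e3)

def hms_label(seconds, pos=None):
    """Format a time in seconds as h:mm:ss, rounded to whole seconds."""
    return str(datetime.timedelta(seconds=int(round(seconds))))

print(khz_label(22050))   # 22.05
print(hms_label(83.4))    # 0:01:23
print(hms_label(3725))    # 1:02:05
```

Either function can be wrapped in `matplotlib.ticker.FuncFormatter` and passed to `set_major_formatter`, as in the code above.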

Result:



Source: https://stackoverflow.com/questions/33680633/how-to-change-pyplot-specgram-x-and-y-axis-scaling
