How to obtain sound envelope using python?

花落未央 2021-02-10 06:33

Hello, I am new to Python and also to sound signal analysis. I am trying to get the envelope of a bird song (zebra finch). It has very rapid signal fluctuations and I tried w

2 answers
  • 2021-02-10 07:21

    The envelope of a signal can be computed as the absolute value of the corresponding analytic signal. SciPy provides the function scipy.signal.hilbert to compute the analytic signal.

    From its documentation:

    We create a chirp of which the frequency increases from 20 Hz to 100 Hz and apply an amplitude modulation.

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.signal import hilbert, chirp
    
    duration = 1.0
    fs = 400.0
    samples = int(fs*duration)
    t = np.arange(samples) / fs
    
    signal = chirp(t, 20.0, t[-1], 100.0)
    signal *= (1.0 + 0.5 * np.sin(2.0*np.pi*3.0*t))
    

    The amplitude envelope is given by the magnitude of the analytic signal.

    analytic_signal = hilbert(signal)
    amplitude_envelope = np.abs(analytic_signal)
    

    Plotted, it looks like this:

    plt.plot(t, signal, label='signal')
    plt.plot(t, amplitude_envelope, label='envelope')
    plt.legend()  # show the labels set above
    plt.show()
    

    It can also be used to compute the instantaneous frequency (see documentation).
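
    As a sketch of that last point, the instantaneous frequency can be estimated from the phase of the analytic signal, along the lines of the scipy.signal.hilbert documentation. The block rebuilds the same chirp so that it runs on its own; treat it as an illustration rather than a tuned analysis.

```python
import numpy as np
from scipy.signal import hilbert, chirp

fs = 400.0
t = np.arange(int(fs * 1.0)) / fs

# Same amplitude-modulated chirp as above (20 Hz rising to 100 Hz).
signal = chirp(t, 20.0, t[-1], 100.0)
signal *= 1.0 + 0.5 * np.sin(2.0 * np.pi * 3.0 * t)

analytic_signal = hilbert(signal)

# Unwrap the phase so it is continuous, then differentiate it.
instantaneous_phase = np.unwrap(np.angle(analytic_signal))
# np.diff returns one sample fewer than the input.
instantaneous_frequency = np.diff(instantaneous_phase) / (2.0 * np.pi) * fs
```

    The estimate is usually unreliable near the start and end of the signal, so the edges are best ignored.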

  • 2021-02-10 07:35

    Since with a bird song the "modulation frequency" probably will be much lower than the "carrier frequency" even with a rapidly varying amplitude, an approximation to the envelope could be obtained by taking the absolute value of your signal and then applying a moving average filter with say 20 ms length.
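
    A minimal NumPy sketch of this rectify-and-average idea follows. The function name, the synthetic test tone, and the 44.1 kHz sample rate are assumptions for illustration only; with a real recording you would pass your own samples and sample rate.

```python
import numpy as np

def moving_average_envelope(x, fs, window_seconds=0.020):
    """Rectify x and smooth it with a moving average of the given length."""
    window = max(1, int(round(window_seconds * fs)))
    kernel = np.ones(window) / window
    # mode="same" keeps the output the same length as the input.
    return np.convolve(np.abs(x), kernel, mode="same")

# Synthetic stand-in for a recording: a 3 kHz tone with a slow 5 Hz amplitude modulation.
fs = 44100.0
t = np.arange(int(fs * 0.5)) / fs
x = (1.0 + 0.5 * np.sin(2.0 * np.pi * 5.0 * t)) * np.sin(2.0 * np.pi * 3000.0 * t)

envelope = moving_average_envelope(x, fs)
```

    The result follows the shape of the modulation but sits below the signal peaks, and the first and last 10 ms are biased low because the window runs off the ends of the data.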

    Still, wouldn't you be interested in frequency variations as well, to adequately characterize the song? In that case, taking the Fourier transform over a moving window would give you far more information, namely the approximate frequency content as a function of time. That is roughly what we humans hear, and it is what helps us discriminate between bird species.
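
    One way to sketch the moving-window Fourier transform is scipy.signal.spectrogram. The gliding test tone, the sample rate, and the window settings below are illustrative assumptions, not values recommended for zebra finch song.

```python
import numpy as np
from scipy.signal import spectrogram

fs = 44100.0
t = np.arange(int(fs * 0.5)) / fs
# Synthetic stand-in for a song: a tone gliding from 2 kHz to 6 kHz.
x = np.sin(2.0 * np.pi * (2000.0 * t + 4000.0 * t**2))

# 512-sample windows with 50 % overlap; both values are illustrative choices.
frequencies, times, power = spectrogram(x, fs=fs, nperseg=512, noverlap=256)

# Dominant frequency in each time slice.
dominant = frequencies[np.argmax(power, axis=0)]
```

    Shorter windows give better time resolution at the cost of coarser frequency resolution, so the window length is worth experimenting with.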

    If you don't want the attenuation, you should neither apply a Butterworth filter nor take the moving average, but apply peak detection instead.

    Moving average: Each output sample is the average of the absolute value of e.g. 50 preceding input samples. The output will be attenuated.

    Peak detection: Each output sample is the maximum of the absolute value of e.g. 50 preceding input samples. The output will not be attenuated. You can lowpass filter afterward to get rid of the remaining staircase "ripple".

    You may wonder why, for example, a Butterworth filter attenuates your signal. It hardly does if your cutoff frequency is high enough; it just SEEMS to be strongly attenuated. Your input signal is not the sum of the carrier (whistle) and the modulation (envelope) but their product. Filtering limits the frequency content, and what remains are frequency components (terms) rather than factors. You see an attenuated modulation (envelope) because that frequency component is present in your signal MUCH more weakly than the original envelope, since the envelope was not added to your carrier but multiplied with it. Because the carrier sinusoid that your envelope is multiplied with is not always at its maximum value, the envelope is "attenuated" by the modulation process, not by your filtering analysis.
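
    To put a number on this (a side check, not part of the original explanation): the average of a rectified sinusoid is 2/π, about 0.64, times its peak. Averaging the absolute value of the signal therefore returns roughly 64 % of the multiplicative envelope even before any filter loss. A small sketch with a unit-amplitude tone, using illustrative values:

```python
import numpy as np

fs = 44100.0
t = np.arange(int(fs * 0.1)) / fs
carrier = np.sin(2.0 * np.pi * 3000.0 * t)  # constant envelope of 1.0

# Mean of the rectified carrier over many periods.
mean_rectified = np.mean(np.abs(carrier))
# Largest rectified sample, which is what peak detection would report.
peak = np.max(np.abs(carrier))

print(mean_rectified, 2.0 / np.pi, peak)
```

    The first two printed values should agree closely, while the peak stays near 1.0, which is why peak detection recovers the envelope without this scaling.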

    In short: If you directly want the (multiplicative) envelope rather than the (additive) frequency component due to modulation (multiplication) with the envelope, take the peak detection approach.

    Peak detection algorithm in "Pythonish" pseudocode, just to get the idea.

    # Untested, but apart from typos this should work fine
    # No attention paid to speed, just to clarify the algorithm
    # Input signal and output signal are Python lists
    # List comprehensions will be a bit faster
    # Numpy will be a lot faster
    
    def getEnvelope (inputSignal):
        
        # Taking the absolute value
        
        absoluteSignal = []
        for sample in inputSignal:
            absoluteSignal.append (abs (sample))
        
        # Peak detection
        
        intervalLength = 50 # Experiment with this number, it depends on your sample frequency and highest "whistle" frequency
        outputSignal = []
        
        for baseIndex in range (intervalLength, len (absoluteSignal)):
            maximum = 0
            for lookbackIndex in range (intervalLength):
                maximum = max (absoluteSignal [baseIndex - lookbackIndex], maximum)
            outputSignal.append (maximum)
        
        return outputSignal
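
    For comparison, here is a vectorized sketch of the same running-maximum idea in NumPy. The function name is hypothetical, and numpy.lib.stride_tricks.sliding_window_view requires NumPy 1.20 or later. The output has len(input) - interval_length + 1 samples, one more than the loop version, because it also includes the very first full window.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def get_envelope_numpy(input_signal, interval_length=50):
    """Running maximum of |input_signal| over interval_length samples."""
    absolute_signal = np.abs(np.asarray(input_signal, dtype=float))
    # Each row is one window of interval_length consecutive samples.
    windows = sliding_window_view(absolute_signal, interval_length)
    return windows.max(axis=1)

# Synthetic stand-in for a recording (illustrative values only).
fs = 44100.0
t = np.arange(int(fs * 0.1)) / fs
x = (1.0 + 0.5 * np.sin(2.0 * np.pi * 5.0 * t)) * np.sin(2.0 * np.pi * 3000.0 * t)

envelope = get_envelope_numpy(x, interval_length=50)
```

    As with the loop version, interval_length should span at least one period of the lowest "whistle" frequency, otherwise the running maximum dips between carrier peaks.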
    