How to remove pops from concatenated sound data in PyAudio

失恋的感觉 2021-01-06 11:45

How do you remove "popping" and "clicking" sounds in audio constructed by concatenating tonal sound clips together?

I have this PyAudio code for generating

3 answers
  • 2021-01-06 12:19

    My initial suspicion that the individual waveforms weren't aligning was correct, which I confirmed by inspecting in Audacity. My solution was to modify the code to start and stop each waveform on the peak of the sine wave.

    def tone(self, frequency, length=1000, play=False, **kwargs):
    
        number_of_frames = int(self.bitrate * length/1000.)
    
        record = False
        x = 0
        y = 0
        while 1:
            x += 1
            v = math.sin(x/((self.bitrate/float(frequency))/math.pi))
    
            # Find where the sine wave first reaches its peak.
            if round(v, 3) == +1:
                record = True
    
            if record:
                self._queue.append(chr(int(v*127+128)))
                y += 1
                if y > number_of_frames and round(v, 3) == +1:
                    # Always end on the peak of the sine wave so clips align.
                    break
    
  • 2021-01-06 12:27

    If you are concatenating clips with varying attributes, you may hear a click wherever the waveforms of two clips do not line up at the point of concatenation.

    One way around this is to fade out the end of the first signal and fade in the beginning of the second, then continue this pattern through the rest of the concatenation. Check here for details on fading.
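
    As a rough illustration of the fade-out/fade-in idea, here is a minimal pure-Python sketch. It assumes the clips are lists of floats centred on 0.0; the helper names apply_fades and sine, and the 10 ms linear ramp, are made up for this example and are not part of PyAudio or pydub.

```python
import math


def apply_fades(samples, fade_len):
    """Return a copy of `samples` with a linear fade-in and fade-out.

    `samples` is a list of floats centred on 0.0 and `fade_len` is the
    length of each ramp in samples (both are illustrative assumptions).
    """
    n = len(samples)
    fade_len = min(fade_len, n // 2)  # keep the two ramps from overlapping
    out = list(samples)
    for i in range(fade_len):
        gain = i / fade_len           # 0.0 at the clip edge, rising towards 1.0
        out[i] *= gain                # fade-in at the start
        out[n - 1 - i] *= gain        # mirrored fade-out at the end
    return out


def sine(frequency, n_samples, sample_rate=44100):
    """Plain sine clip starting at phase 0 (hypothetical test signal)."""
    step = 2 * math.pi * frequency / sample_rate
    return [math.sin(step * i) for i in range(n_samples)]


# Two 100 ms clips, each with 10 ms ramps, joined back to back.
clip_a = apply_fades(sine(440, 4410), 441)
clip_b = apply_fades(sine(150, 4410), 441)
joined = clip_a + clip_b
```

    Because every clip now starts and ends at zero amplitude, the seam has no jump regardless of where each sine wave happened to stop. The cost is a short dip in volume at every join.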

    I would try out the concatenation in a visual tool like Audacity: apply fade-out and fade-in to the clips you want to join, and play around with the timing and settings to get the desired result.

    Next, I am not sure PyAudio has an easy way of implementing fading. However, if you can, you may want to try pydub, which provides easy ways to manipulate audio. It has fade-in and fade-out methods as well as a crossfade option, which basically performs both the fade-out and the fade-in in one step.

    You can install pydub with: pip install pydub

    Here is some sample code using pydub:

    from pydub import AudioSegment
    from pydub.playback import play
    
    #Load first audio segment
    audio1 = AudioSegment.from_wav("SineWave_440Hz.wav")
    
    #Load second audio segment
    audio2 = AudioSegment.from_wav("SineWave_150Hz.wav")
    
    # 1.5 second crossfade
    combinedAudio = audio1.append(audio2, crossfade=1500)
    
    #Play combined Audio
    play(combinedAudio)
    

    Finally, if you really want noise and pops cleared at a professional grade, you may want to look at PSOLA (Pitch Synchronous Overlap and Add). PSOLA works on pitch-synchronous chunks of the signal (in the time domain or, in some variants, the frequency domain), overlapping and adding them to merge the audio with as little noise as possible.

    That was long, but hope it helps.

  • 2021-01-06 12:29

    The answer you've written for yourself will do the trick but isn't really the correct way to do this type of thing.

    One of the problems is that you're checking for the "tip" or peak of the sine wave by comparing against 1. Not every frequency will produce a sample that hits that value, and some may need a large number of cycles before one does.

    Mathematically speaking, the peak of the sine is at sin(pi/2 + 2piK) for all integer values of K.

    To compute sine for a given frequency you use the formula y = sin(2pi * x * f0/fs) where x is the sample number, f0 is the sine frequency and fs is the sample rate.

    For a round number like 1 kHz at a 48 kHz sample rate, the peak lands exactly on sample x=12:

    sin(2pi * 12 * 1000/48000) = sin(2pi * 12/48) = sin(pi/2) = 1
    

    However, at a frequency like 997 Hz the true peak falls a fraction of a sample after sample 12, so no sample lands exactly on it:

    sin(2pi * 11 * 997/48000) = 0.99087178042
    sin(2pi * 12 * 997/48000) = 0.99998889671
    sin(2pi * 13 * 997/48000) = 0.99209828673
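
    The same arithmetic can be checked in a few lines of Python. This is only a sketch; the helper names first_peak_position and sine_at are invented for the example. The first peak sits where the phase equals pi/2, i.e. at the fractional sample index fs / (4 * f0).

```python
import math


def first_peak_position(frequency, sample_rate):
    """Fractional sample index of the first sine peak (phase == pi/2)."""
    return sample_rate / (4.0 * frequency)


def sine_at(x, frequency, sample_rate):
    """y = sin(2*pi * x * f0/fs), the formula used in the answer."""
    return math.sin(2 * math.pi * x * frequency / sample_rate)


for f0 in (1000, 997):
    peak = first_peak_position(f0, 48000)
    nearest = int(round(peak))
    # Print where the peak really is and the value at the nearest sample.
    print(f0, peak, nearest, sine_at(nearest, f0, 48000))
```

    For 1000 Hz the peak position is a whole number of samples. For 997 Hz it is not, so the nearest sample only gets close to 1.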
    

    A better method of stitching the waveforms together is to keep track of the phase from one tone and use that as the starting phase for the next.

    First, for a given frequency, work out the phase increment per sample. It is the argument of the sine formula above with the sample number x factored out:

    phInc = 2*pi*f0/fs
    

    Next, compute the sine and update a variable representing the current phase.

    for x in xrange(number_of_frames):
        y = math.sin(self._phase)
        self._phase += phInc
    

    Putting it all together:

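    def tone(self, frequency, length=1000, play=False, **kwargs):
    
        number_of_frames = int(self.bitrate * length/1000.)
        # Phase advance per sample for this frequency.
        phInc = 2*math.pi*frequency/self.bitrate
    
        for x in xrange(number_of_frames):
            # self._phase persists between calls (initialise it to 0.0 once).
            y = math.sin(self._phase)
            self._phase += phInc
            # Scale [-1, 1] to unsigned 8-bit, as in the original code.
            self._queue.append(chr(int(y*127+128)))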
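
    For anyone who wants to try the phase-tracking idea outside the original class, here is a self-contained Python 3 sketch. The ToneGenerator class, its attribute names and the to_bytes helper are assumptions made for this example, not the asker's actual code. It wraps the phase at 2*pi to keep the number small, and it assumes the output stream is opened as unsigned 8-bit (PyAudio's paUInt8).

```python
import math


class ToneGenerator:
    """Hypothetical stand-in for the class that owns tone() in the question."""

    def __init__(self, sample_rate=44100):
        self.sample_rate = sample_rate
        self._phase = 0.0   # carried over between calls to tone()
        self.samples = []   # floats in [-1.0, 1.0]

    def tone(self, frequency, length_ms=1000):
        n_frames = int(self.sample_rate * length_ms / 1000.0)
        ph_inc = 2 * math.pi * frequency / self.sample_rate
        for _ in range(n_frames):
            self.samples.append(math.sin(self._phase))
            # Wrap the phase; sin() is periodic in 2*pi, so nothing changes.
            self._phase = (self._phase + ph_inc) % (2 * math.pi)

    def to_bytes(self):
        """Convert to unsigned 8-bit PCM (values 1..255, silence at 128)."""
        return bytes(int(round(s * 127 + 128)) for s in self.samples)


gen = ToneGenerator()
gen.tone(440, 100)   # 4410 samples at 440 Hz
gen.tone(150, 100)   # continues from wherever the 440 Hz tone left off
seam_jump = abs(gen.samples[4410] - gen.samples[4409])
```

    The jump across the seam can never exceed one phase increment of the first tone (about 0.063 here), which is the same step size found everywhere else inside that tone, so there is no discontinuity to hear.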
    