Convert PCM wave data to numpy arrays and vice versa

孤城傲影 2021-02-06 05:23

The situation

I am using VAD (Voice Activity Detection) from WebRTC by using WebRTC-VAD, a Python adapter. The example implementation from the GitHub re

1 answer
  • 2021-02-06 06:10

    It seems that WebRTC-VAD and its Python wrapper, py-webrtcvad, expect the audio data to be 16-bit little-endian PCM, which is the most common storage format in WAV files.

    librosa and its underlying I/O library, pysoundfile, however, return floating-point arrays in the range [-1.0, 1.0] by default. To convert these to bytes containing 16-bit PCM, you can use the following float_to_pcm16 function.

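    def float_to_pcm16(audio):
        import numpy

        # Scale [-1.0, 1.0] floats to the signed 16-bit integer range.
        ints = (audio * 32767).astype(numpy.int16)
        # PCM16 samples are signed; '<i2' is little-endian int16.
        little_endian = ints.astype('<i2')
        # tobytes() replaces the deprecated tostring().
        buf = little_endian.tobytes()
        return buf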
    
    
    def read_pcm16(path):
        import soundfile
    
        audio, sample_rate = soundfile.read(path)
        assert sample_rate in (8000, 16000, 32000, 48000)
        pcm_data = float_to_pcm16(audio)
        return pcm_data, sample_rate
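
For the "vice versa" direction in the title, here is a minimal sketch of the reverse conversion. The name pcm16_to_float is hypothetical and not part of any library, and the sketch assumes the bytes hold little-endian signed 16-bit samples, as described above. A self-contained copy of float_to_pcm16 (with clipping added as a precaution) is included so the round trip can be run on its own.

```python
import numpy as np


def float_to_pcm16(audio):
    """Float samples in [-1.0, 1.0] -> little-endian 16-bit PCM bytes."""
    # Clip first so out-of-range samples cannot overflow int16.
    clipped = np.clip(audio, -1.0, 1.0)
    return (clipped * 32767).astype('<i2').tobytes()


def pcm16_to_float(buf):
    """Hypothetical helper: little-endian 16-bit PCM bytes -> float32 samples."""
    ints = np.frombuffer(buf, dtype='<i2')
    # Divide by the same scale factor used when encoding.
    return ints.astype(np.float32) / 32767.0


samples = np.array([0.0, 0.5, -0.5, 1.0, -1.0], dtype=np.float32)
pcm = float_to_pcm16(samples)      # 2 bytes per sample
restored = pcm16_to_float(pcm)     # equal to samples up to quantization error
```

The round trip is lossy only by the 16-bit quantization step, so restored matches samples to within about 1/32767.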
    