Speex split audio data - WebAudio - VoIP

Submitted by 只谈情不闲聊 on 2019-12-05 20:03:22

Speex is a lossy codec, so the output is only an approximation of your initial sine wave.

Your sine frequency is about 7 kHz, which is close to the codec's upper 8 kHz bandwidth limit, so it is even more likely to be altered.

What the codec outputs looks like a comb of Dirac pulses. It will sound like your initial sinusoid heard through a phone, which is certainly different from the original.

See this fiddle, where you can listen to what the codec makes of your original sine waves, whether they are split in half or not.

//Generate a continuous sine wave, both whole and split into two halves
var len = 16384;
var buffer1 = [];
var buffer2 = [];
var buffer = [];
for(var i=0;i<len;i++){
    buffer.push(Math.sin(i/10));
    if(i < len/2)
        buffer1.push(Math.sin(i/10));
    else
        buffer2.push(Math.sin(i/10));
}
//Encode and decode both halves separately
var en = Codec.encode(buffer1);
var dec1 = Codec.decode(en);

var en = Codec.encode(buffer2);
var dec2 = Codec.decode(en);

//Merge the two decoded halves into one output array
var merge = [];
for(var i=0;i<dec1.length;i++)
    merge.push(dec1[i]);

for(var i=0;i<dec2.length;i++)
    merge.push(dec2[i]);

//Encode and decode the whole array
var en = Codec.encode(buffer);
var dec = Codec.decode(en);

//-----------------
//The code below only plays back the different arrays
//-------------------
var audioCtx = new (window.AudioContext || window.webkitAudioContext)();
function play (sound)
{
    var audioBuffer = audioCtx.createBuffer(1, sound.length, 44100);
    var bufferData = audioBuffer.getChannelData(0);
    bufferData.set(sound);

    var source = audioCtx.createBufferSource();
    source.buffer = audioBuffer;
    source.connect(audioCtx.destination);
    source.start();
}

$("#o").click(function() { play(dec); });
$("#c1").click(function() { play(dec1); });
$("#c2").click(function() { play(dec2); });
$("#m").click(function() { play(merge); });

If you merge the two half-signal decoder outputs, you will hear an additional click caused by the abrupt transition from one signal to the other, much like the sound of a relay switching.
To avoid that, you would have to smooth the values around the merging point of your two buffers.
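
One way to smooth the merging point, as a minimal sketch: fade the last few samples of the first buffer down to zero, fade the first few samples of the second buffer back up, then concatenate. The helper name `joinWithFade` and the default fade length of 64 samples are assumptions made for illustration and are not part of the codec library. This only softens the discontinuity at the joint; it does not restore any decoder state lost between the two separate calls.

```javascript
// Hypothetical helper: concatenate two decoded buffers, fading around the joint.
function joinWithFade(first, second, fadeLen) {
    // Never fade more samples than the shorter buffer contains.
    var n = Math.min(fadeLen === undefined ? 64 : fadeLen, first.length, second.length);
    var out = new Float32Array(first.length + second.length);
    out.set(first, 0);
    out.set(second, first.length);
    for (var k = 0; k < n; k++) {
        var gain = k / n;                   // 0 at the joint, rising towards 1 away from it
        out[first.length - 1 - k] *= gain;  // fade out the tail of the first buffer
        out[first.length + k] *= gain;      // fade in the head of the second buffer
    }
    return out;
}
```

With the arrays from the fiddle, `play(joinWithFade(dec1, dec2, 64))` could stand in for `play(merge)`. A longer fade is gentler on the ear but mutes more of the signal around the joint.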

Kirill Slatin

Note that Speex is a lossy codec, so by definition it can't exactly reproduce the buffer that was encoded. Besides, it is designed as a codec for voice, so it expects a specific form of signal and will be most efficient in the 1-2 kHz range. In a way, it can be compared to JPEG for raster images.

I've slightly modified your jsfiddle example so you can play with different parameters and compare the results. Feeding a codec a simple sinusoid of unknown frequency is not a proper way to test it. However, in the example you can see how the initial signal is affected differently at different frequencies.

buffer1.push(Math.sin(2*Math.PI*i*frequency/sampleRate));
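
As a rough sketch of how that line can be wrapped up, the helper below generates a test tone whose frequency is known in hertz. The name `makeSine` and the example values (1 kHz at 8000 samples per second) are assumptions for illustration; the sample rate should be whatever rate the codec is actually configured for, and the frequency has to stay below half of it to avoid aliasing.

```javascript
// Hypothetical helper: `length` samples of a sine wave at `frequency` Hz.
function makeSine(frequency, sampleRate, length) {
    var samples = new Float32Array(length);
    for (var i = 0; i < length; i++) {
        samples[i] = Math.sin(2 * Math.PI * i * frequency / sampleRate);
    }
    return samples;
}

// Example: a 1 kHz tone at 8000 samples per second (both values are assumptions).
var tone = makeSine(1000, 8000, 16384);
```

Sweeping `frequency` over several values and listening to each decoded result makes it easier to hear where the codec starts to alter the tone.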

I think you should build an example with a recorded voice and compare the results in that case. That would be a more appropriate test.
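
For a crude numeric comparison alongside listening, something like the root-mean-square difference between the original and the decoded samples could be used. This is only a sketch: the name `rmsError` is hypothetical, and it assumes both arrays are at the same sample rate and already aligned in time. Codecs commonly add some delay, so an unaligned comparison will overstate the error, and a low number does not by itself mean the result sounds good.

```javascript
// Hypothetical helper: root-mean-square difference between two signals.
// Only the overlapping part is compared if the lengths differ.
function rmsError(original, decoded) {
    var n = Math.min(original.length, decoded.length);
    if (n === 0) return NaN;  // nothing to compare
    var sum = 0;
    for (var i = 0; i < n; i++) {
        var d = original[i] - decoded[i];
        sum += d * d;
    }
    return Math.sqrt(sum / n);
}
```

With the arrays from the fiddle above, `rmsError(buffer, dec)` and `rmsError(buffer, merge)` would give two figures to set side by side.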

In general, to understand this in detail you would have to study digital signal processing. I can't even provide a proper link, since it is a whole science and it is mathematically intensive (the only proper book I know of is in Russian). If anyone here with a strong mathematics background can share suitable literature, I would appreciate it.

EDIT: as mentioned by Kuroi Neko, there is a problem at the boundaries of the buffer. It also seems impossible to save the decoder state as mentioned in this post, because the library in use doesn't support it. If you look at the source code, you will see that it uses a third-party Speex codec and does not provide full access to its features. I think the best approach would be to find a decent Speex library that supports state recovery, similar to this
