Speex decoding going wrong

Posted by 纵然是瞬间 on 2019-12-04 15:35:42

I found the reason the encoded data was so different. Part of it is that Speex is a lossy codec, as Paulo Scardine said. The other part is that Speex (in narrowband mode) only works on frames of exactly 160 samples, so data going from PortAudio to Speex has to be handed over in "packets" of 160 frames.
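As a rough illustration of the "packets of 160" point, here is a minimal Python sketch that collects capture buffers of arbitrary length and only hands out complete 160-sample frames. `FrameAssembler` and its methods are hypothetical names made up for this example, not part of PortAudio or Speex, and the samples are plain integers rather than real audio.

```python
FRAME_SIZE = 160  # samples per narrowband Speex frame


class FrameAssembler:
    """Hypothetical helper: buffers samples and emits fixed-size frames."""

    def __init__(self, frame_size=FRAME_SIZE):
        self.frame_size = frame_size
        self.pending = []  # samples still waiting for a full frame

    def push(self, samples):
        """Add samples and return the list of complete frames (may be empty)."""
        self.pending.extend(samples)
        frames = []
        while len(self.pending) >= self.frame_size:
            frames.append(self.pending[:self.frame_size])
            del self.pending[:self.frame_size]
        return frames


assembler = FrameAssembler()
first = assembler.push(list(range(100)))        # too short, nothing emitted
second = assembler.push(list(range(100, 400)))  # 400 samples seen so far
print(len(first), len(second), len(assembler.pending))
```

Each frame returned by `push` would then be passed to the encoder as one unit; the leftover samples stay in `pending` until the next capture callback delivers more.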

Speex actually introduces an additional delay to the audio data. I found this out by reverse engineering:

narrow band : delay = 200 - framesize + lookahead = 200 - 160 +  40 =  80 samples 

wide band   : delay = 400 - framesize + lookahead = 400 - 320 + 143 = 223 samples

ultra-wide band : delay = 800 - framesize + lookahead = 800 - 640 + 349 = 509 samples

Since the lookahead is initialized with zeros, the first few decoded samples come out "close to zero".

To get the timing right, you must skip that many samples at the start of the decoded output before you reach the audio data you actually fed into the codec. Why that is, I don't know. Probably the author of Speex never cared about this, since Speex is meant for streaming and not primarily for storing and restoring audio data. Another workaround (so that no space is wasted) is to feed (framesize - delay) zeros into the codec before feeding your actual audio data, and then drop the entire first decoded Speex frame.
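Both workarounds can be sketched in Python without touching the real library. Note the big assumption: `fake_codec` below is a hypothetical stand-in that models encode plus decode as nothing more than a zero-filled delay line. It is not Speex and does no compression; it only reproduces the timing behaviour described above, using the narrowband numbers (frame size 160, delay 80).

```python
FRAME_SIZE = 160  # narrowband frame size in samples
DELAY = 80        # narrowband codec delay in samples


def fake_codec(samples, delay=DELAY):
    """Hypothetical model of encode+decode: output is the input shifted by
    `delay` zeros and cut to the input length (the tail stays in the codec)."""
    return ([0] * delay + list(samples))[:len(samples)]


def skip_delay(samples):
    """Workaround 1: decode normally, then skip the first DELAY samples."""
    decoded = fake_codec(samples)
    return decoded[DELAY:]


def pad_and_drop_first_frame(samples):
    """Workaround 2: prepend (FRAME_SIZE - DELAY) zeros so the total shift is
    exactly one frame, then drop the whole first decoded frame."""
    padded = [0] * (FRAME_SIZE - DELAY) + list(samples)
    decoded = fake_codec(padded)
    return decoded[FRAME_SIZE:]


audio = list(range(1, 321))  # two frames of made-up, non-zero "audio"
aligned_a = skip_delay(audio)
aligned_b = pad_and_drop_first_frame(audio)
print(aligned_a == aligned_b, len(aligned_a))
```

In this model both variants return the input starting at its first sample. The last `DELAY` samples are missing in both cases because they are still inside the delay line; getting them out would need some extra input after the real audio.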

I hope this clarifies everything. If someone familiar with Speex reads this, feel free to correct me if I am wrong.

EDIT: Actually, both the decoder and the encoder have a lookahead. The actual formula for the delay is:

narrow band : delay = decoder_lh + encoder_lh =  40 +  40 =  80 samples 

wide band   : delay = decoder_lh + encoder_lh =  80 + 143 = 223 samples

ultra-wide band : delay = decoder_lh + encoder_lh = 160 + 349 = 509 samples
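To double-check that the two ways of writing the delay agree, here is a small Python sketch that evaluates both formulas from the numbers quoted in this answer. The table layout and the key names (`base`, `encoder_lh` and so on) are my own labels for this example, and the values are the ones found by reverse engineering above, not values read from the library.

```python
# Per-mode numbers as quoted in this answer (all in samples).
MODES = {
    "narrow band":     {"base": 200, "framesize": 160, "decoder_lh": 40,  "encoder_lh": 40},
    "wide band":       {"base": 400, "framesize": 320, "decoder_lh": 80,  "encoder_lh": 143},
    "ultra-wide band": {"base": 800, "framesize": 640, "decoder_lh": 160, "encoder_lh": 349},
}


def delay_first_formula(mode):
    """delay = base - framesize + lookahead (the original guess)."""
    m = MODES[mode]
    return m["base"] - m["framesize"] + m["encoder_lh"]


def delay_second_formula(mode):
    """delay = decoder_lh + encoder_lh (the formula from the EDIT)."""
    m = MODES[mode]
    return m["decoder_lh"] + m["encoder_lh"]


delays = {mode: delay_second_formula(mode) for mode in MODES}
for mode, delay in delays.items():
    # Both formulas must give the same number for every mode.
    assert delay == delay_first_formula(mode)
    print(f"{mode:16s}: {delay} samples")
```

The two formulas agree because, for these numbers, the decoder lookahead is exactly base - framesize in every mode (40, 80 and 160 samples).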

You may want to have a look here for some simple encoding/decoding: http://www.speex.org/docs/manual/speex-manual/node13.html#SECTION001310000000000000000

Since you are using UDP, you may also want to use a jitter buffer to reorder packets and smooth out variations in their arrival time.
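To give an idea of what the reordering part of a jitter buffer does, here is a very reduced Python sketch. It assumes every packet carries a sequence number, which you would have to add to your UDP payload yourself. `ReorderBuffer` is a hypothetical class for illustration only; it is not the Speex jitter buffer API, and it ignores timing and lost-packet concealment completely.

```python
import heapq


class ReorderBuffer:
    """Hypothetical sketch: holds up to `depth` packets and releases them
    in sequence-number order. Packets that arrive too late are dropped."""

    def __init__(self, depth=2):
        self.depth = depth
        self.heap = []           # min-heap of (seq, payload)
        self.last_released = -1  # highest sequence number already released

    def push(self, seq, payload):
        """Store one packet; return the packets that are now released."""
        if seq <= self.last_released:
            return []  # its slot has already been played out
        heapq.heappush(self.heap, (seq, payload))
        released = []
        while len(self.heap) > self.depth:
            released.append(self._pop())
        return released

    def flush(self):
        """Release everything that is still buffered, in order."""
        released = []
        while self.heap:
            released.append(self._pop())
        return released

    def _pop(self):
        item = heapq.heappop(self.heap)
        self.last_released = item[0]
        return item


buf = ReorderBuffer(depth=2)
played = []
for seq in [0, 2, 1, 4, 3, 5]:  # packets arriving out of order
    played.extend(buf.push(seq, f"frame{seq}"))
played.extend(buf.flush())
late = buf.push(1, "frame1")    # arrives after its slot was played
print([seq for seq, _ in played], late)
```

A larger `depth` tolerates more reordering but adds more latency, which is the usual trade-off with any jitter buffer.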
