Deinterleaving PCM (*.wav) stereo audio data

倾然丶 夕夏残阳落幕, submitted on 2019-12-04 05:44:46

A .wav file typically stores its PCM data in little endian format, with 16 bits per sample per channel. For the usual signed 16-bit PCM file, this means that the data is physically stored as

[LEFT LSB] [LEFT MSB] [RIGHT LSB] [RIGHT MSB] ...

so that every group of 4 bytes makes up a single stereo PCM sample. Hence, you can find sample i by looking at bytes 4*i through 4*i+3, inclusive.

To decode a single 16-bit value from two bytes, you do this:

(MSB << 8) | LSB
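As a minimal sketch of that formula, the helper below assembles one signed 16-bit sample from two raw bytes. The name decode_s16le is hypothetical, and the sketch assumes the bytes are available as uint8_t, so no sign extension can occur. The final cast back to int16_t relies on the two's complement behavior that mainstream compilers provide.

```c
#include <stdint.h>

/* Hypothetical helper (not from the original code): combine a low byte and
   a high byte, stored little endian, into one signed 16-bit sample. */
static int16_t decode_s16le(uint8_t lsb, uint8_t msb)
{
    /* Build the 16-bit pattern using unsigned arithmetic only. */
    uint16_t bits = (uint16_t)(((unsigned)msb << 8) | lsb);

    /* Reinterpret the pattern as signed. Values above 0x7fff wrap to
       negative numbers on two's complement implementations. */
    return (int16_t)bits;
}
```

For example, the byte pair 0x34, 0x12 decodes to 0x1234, and the pair 0xff, 0xff decodes to -1.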

Because your read buffer values are stored as signed chars, you have to be a bit careful because both MSB and LSB will be sign-extended. This is undesirable for the LSB; therefore, the code uses

0xff & (int)LSB

to obtain the unsigned version of the low byte (technically, this works by upcasting to an int, and selecting the low 8 bits; an alternate formulation would be to just write (uint8_t)LSB).
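To see why the mask matters, here is a small sketch comparing the two variants. The function names combine_unmasked and combine_masked are hypothetical. The sketch multiplies the high byte by 256 instead of shifting it, because the C standard leaves left-shifting a negative value undefined. It also assumes a two's complement machine for the bitwise OR on negative values.

```c
/* Hypothetical helper: combines the bytes WITHOUT masking the low byte.
   When lsb is negative it is sign-extended to an int whose upper bits are
   all ones, and the OR then wipes out the high byte. */
static int combine_unmasked(signed char lsb, signed char msb)
{
    return (msb * 256) | lsb;
}

/* Hypothetical helper: same combination, but the low byte is reduced to
   its low 8 bits first, as in the expression discussed above. */
static int combine_masked(signed char lsb, signed char msb)
{
    return (msb * 256) | (0xff & (int)lsb);
}
```

With a high byte of 0x12 and a low byte holding the bit pattern 0x80 (that is, -128 as a signed char), the unmasked version yields -128, while the masked version yields the intended 0x1280.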

Note that the MSBs are at indices 1 and 3, and the LSBs are at indices 0 and 2. So,

((readbuffer[i*4+1]<<8) | (0x00ff&(int)readbuffer[i*4]))

and

((readbuffer[i*4+3]<<8) | (0x00ff&(int)readbuffer[i*4+2]))

are just obtaining the values of the left and right channels as 16-bit signed values by using some bit manipulation to assemble the bytes into numbers.
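Putting the indexing and the masking together, a deinterleaving loop might look like the sketch below. The function name deinterleave_s16le and its parameters are assumptions made for illustration; only readbuffer and the 4-byte layout come from the discussion above. The high byte is multiplied by 256 instead of shifted, which gives the same result without left-shifting a negative value.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: split `frames` interleaved stereo frames of signed
   16-bit little endian PCM into separate left and right arrays.
   readbuffer must hold at least 4 * frames bytes. */
static void deinterleave_s16le(const signed char *readbuffer, size_t frames,
                               int16_t *left, int16_t *right)
{
    for (size_t i = 0; i < frames; i++) {
        const signed char *f = readbuffer + 4 * i;  /* start of frame i */

        /* Bytes 0 and 2 are the LSBs (masked to avoid sign extension),
           bytes 1 and 3 are the MSBs (the sign is kept on purpose). */
        left[i]  = (int16_t)(f[1] * 256 + (0xff & (int)f[0]));
        right[i] = (int16_t)(f[3] * 256 + (0xff & (int)f[2]));
    }
}
```

Each result always fits in an int16_t: the high byte contributes a multiple of 256 between -32768 and 32512, and the masked low byte adds at most 255.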

Then, each of these values is divided by 32768.0. A signed 16-bit value has a range of [-32768, 32767], so dividing by 32768 gives a value in [-1, 1) (the largest possible result is 32767/32768, just under 1). The two scaled values are added to give a number in the range [-2, 2), and the sum is multiplied by 0.5 to obtain the average of the two channels, a floating-point value in [-1, 1).
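The scaling and averaging step can be sketched as a small function. The name mix_to_mono is hypothetical, and the sketch assumes the left and right samples have already been decoded to int16_t.

```c
#include <stdint.h>

/* Hypothetical helper: scale both channels by 1/32768 and average them,
   giving one mono value as a double. */
static double mix_to_mono(int16_t left, int16_t right)
{
    /* Dividing by the double constant 32768.0 converts each sample to
       floating point before the division takes place. */
    return 0.5 * (left / 32768.0 + right / 32768.0);
}
```

Two full-scale negative samples (-32768 on both channels) give exactly -1.0, while two full-scale positive samples (32767 on both channels) give 32767/32768, slightly below 1.0.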
