Reducing sample bit-depth by truncating

…衆ロ難τιáo~ submitted on 2020-05-26 09:59:10

Question


I have to reduce the bit-depth of a digital audio signal from 24 to 16 bit.

Is taking only the 16 most significant bits of each sample (i.e. truncating) equivalent to doing a proportional calculation (out = in * 0xFFFF / 0xFFFFFF)?


Answer 1:


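I assume you mean (in * 0xFFFF) / 0xFFFFFF, in which case, yes, to within one output step for unsigned samples: the two forms agree at zero and at full scale, but in between the division can land one step below the shift.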
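A quick way to check the claim is to run both forms over every unsigned 24-bit input. The sketch below assumes unsigned samples held in a uint32_t; the function names are made up for illustration and are not from the question.

```c
#include <stdint.h>

/* Keep the 16 most significant bits of an unsigned 24-bit sample. */
uint16_t truncate_u24(uint32_t in)
{
    return (uint16_t)(in >> 8);
}

/* Proportional form from the question. The product needs up to 40 bits,
   so it is widened to 64 bits before dividing. */
uint16_t scale_u24(uint32_t in)
{
    return (uint16_t)(((uint64_t)in * 0xFFFFu) / 0xFFFFFFu);
}

/* Count the 24-bit inputs for which the two forms disagree. */
uint32_t count_mismatches(void)
{
    uint32_t n = 0;
    for (uint32_t in = 0; in <= 0xFFFFFFu; ++in)
        if (truncate_u24(in) != scale_u24(in))
            ++n;
    return n;
}
```

For example, in = 256 gives 1 from the shift but 0 from the division, because 0xFFFFFF is slightly more than 256 times 0xFFFF. Both forms return 0xFFFF for in = 0xFFFFFF.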




Answer 2:


You'll get better sounding results by adding a carefully crafted noise signal to the original signal, just below the truncating threshold, before truncating (a.k.a. dithering).




Answer 3:


x * 0xffff / 0xffffff is overly pedantic, but not in a good way if your samples are signed -- and probably not in a good way in general.

Yes, you want the maximum value in your source range to match the maximum value in your destination range, but the values used there are only for unsigned ranges, and the distribution of quantisation steps means that it'll be very rare that you use the largest possible output value.

If the samples are signed then the peak positive values would be 0x7fff and 0x7fffff, while the peak negative values would be -0x8000 and -0x800000. Your first problem is deciding whether +1 is equal to 0x7fff, or -1 is equal to -0x8000. If you choose the latter then it's a simple shift operation. If you try to have both then zero stops being zero.

After that you have a problem that division rounds towards zero. This means that too many values get rounded to zero compared with other values. This causes distortion.
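The effect of rounding towards zero can be seen by counting how many inputs land on each output. This is only an illustrative sketch: it uses a plain in / 256 on signed 24-bit values rather than the exact ratio from the question, and the function name is hypothetical.

```c
#include <stdint.h>

/* Count how many signed 24-bit inputs map to `target` under in / 256.
   C99 and later define integer division as truncating toward zero. */
uint32_t inputs_per_output_div(int32_t target)
{
    uint32_t n = 0;
    for (int32_t in = -0x800000; in <= 0x7fffff; ++in)
        if (in / 256 == target)
            ++n;
    return n;
}
```

Every input from -255 to 255 divides to 0, so zero collects 511 inputs while its neighbours +1 and -1 collect 256 each. That double-width step around zero is the source of the distortion.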

If you want to scale according to the peak positive values, the correct form would be:

out = rint((double)in * 0x7fff / 0x7fffff);  /* double, because the product needs more precision than float's 24-bit significand */

If you fish around a bit you can probably find an efficient way to do that with integer arithmetic and no division.

This form should correctly round to the nearest available output value for any given input, and it should map the largest possible input value to the largest possible output value, but it's going to have an ugly distribution of quantisation steps scattered throughout the range.
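One possible integer-only version of the peak-matching form is sketched below. It still uses a division, so it is not the division-free variant hinted at above, and the function name is an assumption for illustration. Because 0x7fffff is odd, no input ever falls exactly halfway between two outputs, so the choice of tie-breaking rule does not matter here.

```c
#include <stdint.h>

/* Scale a signed 24-bit sample so that 0x7fffff maps to 0x7fff,
   rounding to the nearest output value. */
int16_t scale_peak_int(int32_t in)
{
    const int64_t scale = 0x7fff;
    const int64_t den = 0x7fffff;
    int64_t num = (int64_t)in * scale;   /* needs about 38 bits */

    /* Offset by half the divisor, with the sign of the input, so that
       the truncating division rounds to nearest instead of toward zero. */
    num += (in >= 0) ? den / 2 : -(den / 2);
    return (int16_t)(num / den);
}
```

With this sketch an input of 128 maps to 0 and 129 maps to 1, and the most negative input, -0x800000, maps to -0x7fff rather than -0x8000.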

Most people prefer:

out = (in + 128) >> 8;
if (out > 0x7fff) out = 0x7fff;

This form makes things the tiniest bit louder, to the point that positive values may clip slightly, but the quantisation steps are distributed evenly.

You add 128 because right-shift rounds towards negative infinity. The average quantisation error is -128 and you add 128 to correct this to keep 0 at precisely 0. The test for overflow is necessary because an input value of 0x7fffff would otherwise give a result of 0x8000, and when you store this in a 16-bit word it would wrap around giving a peak negative value.

C pedants can poke holes in the assumptions about right-shift and division behaviour, but I'm overlooking those for clarity.
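For anyone who does want to sidestep the right-shift question, the shift-and-round form can be written with an explicit floor division. This is a sketch under the assumption that inputs are signed 24-bit values stored in an int32_t; both function names are hypothetical.

```c
#include <stdint.h>

/* Floor division by 256 that does not rely on >> being an arithmetic
   shift for negative values (that behaviour is implementation-defined).
   Assumes |x| is far below INT32_MAX, which holds for 24-bit samples. */
int32_t floor_div256(int32_t x)
{
    return (x >= 0) ? x / 256 : -((-x + 255) / 256);
}

/* Round a signed 24-bit sample to 16 bits with evenly spaced steps. */
int16_t round_24_to_16(int32_t in)
{
    int32_t out = floor_div256(in + 128);
    if (out > 0x7fff)          /* 0x7fffff would otherwise give 0x8000 */
        out = 0x7fff;
    return (int16_t)out;
}
```

Inputs from -128 to 127 all map to 0, so the step around zero is the same width as every other step, and 0x7fffff clips to 0x7fff instead of wrapping.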

However, as others have pointed out, you generally shouldn't reduce the bit depth of audio without dithering, and ideally noise shaping. TPDF (triangular probability density function) dither is as follows:

out = (in + (rand() & 255) - (rand() & 255)) >> 8;
if (out < -0x8000) out = -0x8000;
if (out > 0x7fff) out = 0x7fff;

Again, big issues with the usage of rand() which I'm going to overlook for clarity.
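As a rough sketch of how the rand() concern might be addressed, the version below swaps in a small xorshift32 generator and takes the noise from its top 8 bits. The generator choice, the function names and the block-conversion helper are all assumptions made for illustration, not part of the answer; the dither formula itself is kept as written above.

```c
#include <stddef.h>
#include <stdint.h>

/* Marsaglia's xorshift32. The state must be seeded with a nonzero value. */
uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Floor division by 256 without relying on >> for negative values. */
int32_t floor_div256(int32_t x)
{
    return (x >= 0) ? x / 256 : -((-x + 255) / 256);
}

/* TPDF dither: the difference of two uniform 8-bit values (taken from
   the top bits of r1 and r2) is added before the samples are reduced. */
int16_t dither_24_to_16(int32_t in, uint32_t r1, uint32_t r2)
{
    int32_t noise = (int32_t)(r1 >> 24) - (int32_t)(r2 >> 24);
    int32_t out = floor_div256(in + noise);
    if (out < -0x8000) out = -0x8000;
    if (out > 0x7fff)  out = 0x7fff;
    return (int16_t)out;
}

/* Convert a block of signed 24-bit samples held in int32_t. */
void convert_block(const int32_t *in, int16_t *out, size_t n,
                   uint32_t *rng_state)
{
    for (size_t i = 0; i < n; ++i) {
        uint32_t r1 = xorshift32(rng_state);
        uint32_t r2 = xorshift32(rng_state);
        out[i] = dither_24_to_16(in[i], r1, r2);
    }
}
```

Passing the random words in as parameters keeps dither_24_to_16 deterministic and easy to test. Note that the noise has zero mean but the floor still pulls the output down by about half a step on average, so adding 128 as in the undithered form may be worth considering.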




Answer 4:


Dithering by adding noise will in general give you better results. The key to this is the shape of the noise. The POW-r dithering algorithms have a specific noise shape and are popular in a lot of digital audio workstation applications (Cakewalk's SONAR, Logic, etc).

If you don't need the full fidelity of POW-r, you can simply generate some noise at fairly low amplitude and mix it into your signal. You'll find this masks some of the quantization effects.



Source: https://stackoverflow.com/questions/4022838/reducing-sample-bit-depth-by-truncating
