Mixing 16 bit linear PCM streams and avoiding clipping/overflow

廉价感情. 提交于 2019-12-02 18:53:48

here's a descriptive implementation:

short int mix_sample(short int sample1, short int sample2) {
    const int32_t result(static_cast<int32_t>(sample1) + static_cast<int32_t>(sample2));
    typedef std::numeric_limits<short int> Range;
    if (Range::max() < result)
        return Range::max();
    else if (Range::min() > result)
        return Range::min();
    else
        return result;
}

to mix, it's just add and clip!

to avoid clipping artifacts, you will want to use saturation or a limiter. ideally, you will have a small int32_t buffer with a small amount of lookahead. this will introduce latency.

more common than limiting everywhere, is to leave a few bits' worth of 'headroom' in your signal.

The best solution I have found is given by Viktor Toth. He provides a solution for 8-bit unsigned PCM, and changing that for 16-bit signed PCM, produces this:

int a = 111; // first sample (-32768..32767)
int b = 222; // second sample
int m; // mixed result will go here

// Make both samples unsigned (0..65535)
a += 32768;
b += 32768;

// Pick the equation
if ((a < 32768) || (b < 32768)) {
    // Viktor's first equation when both sources are "quiet"
    // (i.e. less than middle of the dynamic range)
    m = a * b / 32768;
} else {
    // Viktor's second equation when one or both sources are loud
    m = 2 * (a + b) - (a * b) / 32768 - 65536;
}

// Output is unsigned (0..65536) so convert back to signed (-32768..32767)
if (m == 65536) m = 65535;
m -= 32768;

Using this algorithm means there is almost no need to clip the output as it is only one value short of being within range. Unlike straight averaging, the volume of one source is not reduced even when the other source is silent.

Here is what I did on my recent synthesizer project.

int* unfiltered = (int *)malloc(lengthOfLongPcmInShorts*4);
int i;
for(i = 0; i < lengthOfShortPcmInShorts; i++){
    unfiltered[i] = shortPcm[i] + longPcm[i];
}
for(; i < lengthOfLongPcmInShorts; i++){
     unfiltered[i] = longPcm[i];
}

int max = 0;
for(int i = 0; i < lengthOfLongPcmInShorts; i++){
   int val = unfiltered[i];
   if(abs(val) > max)
      max = val;
}

short int *newPcm = (short int *)malloc(lengthOfLongPcmInShorts*2);
for(int i = 0; i < lengthOfLongPcmInShorts; i++){
   newPcm[i] = (unfilted[i]/max) * MAX_SHRT;
}

I added all the PCM data into an integer array, so that I get all the data unfiltered.

After doing that I looked for the absolute max value in the integer array.

Finally, I took the integer array and put it into a short int array by taking each element dividing by that max value and then multiplying by the max short int value.

This way you get the minimum amount of 'headroom' needed to fit the data.

You might be able to do some statistics on the integer array and integrate some clipping, but for what I needed the minimum amount of headroom was good enough for me.

I think they should be functions mapping [MIN_SHORT, MAX_SHORT] -> [MIN_SHORT, MAX_SHORT] and they are clearly not (besides first one), so overflows occurs.

If unwind's proposition won't work you can also try:

((long int)(sample1) + sample2) / 2

Since you are in time domain the frequency info is in the difference between successive samples, when you divide by two you damage that information. That's why adding and clipping works better. Clipping will of course add very high frequency noise which is probably filtered out.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!