What is the best (cleanest, most efficient) way to write saturating addition in C?
The function or macro should add two unsigned inputs (need both 16- and 32-bit ver
//function-like macro to add signed vals,
//then test for overlow and clamp to max if required
#define SATURATE_ADD(a,b,val) ( {\
if( (a>=0) && (b>=0) )\
{\
val = a + b;\
if (val < 0) {val=0x7fffffff;}\
}\
else if( (a<=0) && (b<=0) )\
{\
val = a + b;\
if (val > 0) {val=-1*0x7fffffff;}\
}\
else\
{\
val = a + b;\
}\
})
I did a quick test and seems to work, but not extensively bashed it yet! This works with SIGNED 32 bit. op : the editor used on the web page does not let me post a macro ie its not understanding non-indented syntax etc!
The best performance will usually involve inline assembly (as some have already stated).
But for portable C, these functions only involve one comparison and no type-casting (and thus I believe optimal):
unsigned saturate_add_uint(unsigned x, unsigned y)
{
if (y > UINT_MAX - x) return UINT_MAX;
return x + y;
}
unsigned short saturate_add_ushort(unsigned short x, unsigned short y)
{
if (y > USHRT_MAX - x) return USHRT_MAX;
return x + y;
}
As macros, they become:
SATURATE_ADD_UINT(x, y) (((y)>UINT_MAX-(x)) ? UINT_MAX : ((x)+(y)))
SATURATE_ADD_USHORT(x, y) (((y)>SHRT_MAX-(x)) ? USHRT_MAX : ((x)+(y)))
I leave versions for 'unsigned long' and 'unsigned long long' as an exercise to the reader. ;-)
Saturation arithmetic is not standard for C, but it's often implemented via compiler intrinsics, so the most efficient way will not be the cleanest. You must add #ifdef
blocks to select the proper way. MSalters's answer is the fastest for x86 architecture. For ARM you need to use __qadd16
function (ARM compiler) of _arm_qadd16
(Microsoft Visual Studio) for 16 bit version and __qadd
for 32-bit version. They'll be automatically translated to one ARM instruction.
Links:
In plain C:
uint16_t sadd16(uint16_t a, uint16_t b) {
return (a > 0xFFFF - b) ? 0xFFFF : a + b;
}
uint32_t sadd32(uint32_t a, uint32_t b) {
return (a > 0xFFFFFFFF - b) ? 0xFFFFFFFF : a + b;
}
which is almost macro-ized and directly conveys the meaning.
In IA32 without conditional jumps:
uint32_t sadd32(uint32_t a, uint32_t b)
{
#if defined IA32
__asm
{
mov eax,a
xor edx,edx
add eax,b
setnc dl
dec edx
or eax,edx
}
#elif defined ARM
// ARM code
#else
// non-IA32/ARM way, copy from above
#endif
}
An alternative to the branch free x86 asm solution is (AT&T syntax, a and b in eax and ebx, result in eax):
add %eax,%ebx
sbb $0,%ebx