How to do unsigned saturating addition in C?

孤独总比滥情好

What is the best (cleanest, most efficient) way to write saturating addition in C?

The function or macro should add two unsigned inputs (need both 16- and 32-bit ver


17 answers

终归单人心

2020-11-27 03:14
Zero branch solution:
```
uint32_t sadd32(uint32_t a, uint32_t b)
{
    uint64_t s = (uint64_t)a+b;
    return -(s>>32) | (uint32_t)s;
}
```
A good compiler will optimize this to avoid doing any actual 64-bit arithmetic (s>>32 will merely be the carry flag, and -(s>>32) is the result of sbb %eax,%eax).

In x86 asm (AT&T syntax, a and b in eax and ebx, result in eax):
```
add %eax,%ebx
sbb %eax,%eax
or %ebx,%eax
```
8- and 16-bit versions should be obvious. Signed version might require a bit more work.
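For reference, the narrower versions might look like the sketch below. The names `sadd16` and `sadd8` are assumptions chosen to match `sadd32`; the idea is unchanged: widen, add, and use the carry bit to build an all-ones mask.

```c
#include <stdint.h>

/* 16-bit saturating add: widen to 32 bits so the carry lands in bit 16. */
uint16_t sadd16(uint16_t a, uint16_t b)
{
    uint32_t s = (uint32_t)a + b;
    /* s >> 16 is 1 on overflow, so -(s >> 16) is all ones; otherwise it is 0. */
    return (uint16_t)(-(s >> 16) | s);
}

/* 8-bit saturating add: same idea, the carry lands in bit 8. */
uint8_t sadd8(uint8_t a, uint8_t b)
{
    uint32_t s = (uint32_t)a + b;
    return (uint8_t)(-(s >> 8) | s);
}
```

With these definitions, `sadd16(0xFFFF, 1)` saturates to `0xFFFF` and `sadd8(100, 100)` returns `200` unchanged.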
不思量自难忘°

2020-11-27 03:15
I'm not sure if this is faster than Skizz's solution (always profile), but here's an alternative no-branch assembly solution. Note that this requires the conditional move (CMOV) instruction, which I'm not sure is available on your target.
```
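uint32_t sadd32(uint32_t a, uint32_t b)
{
    /* MSVC-style inline assembly (32-bit x86 only); the result is left in eax */
    __asm
    {
        mov eax, a
        add eax, b            ; sets CF on unsigned overflow
        mov edx, 0xffffffff
        cmovc eax, edx        ; saturate if the carry flag is set
    }
}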
```
谎友^

2020-11-27 03:16
You probably want portable C code here, which your compiler will turn into proper ARM assembly. ARM has conditional moves, and these can be conditional on overflow. The algorithm then becomes: add and conditionally set the destination to unsigned(-1), if overflow was detected.
```
uint16_t add16(uint16_t a, uint16_t b)
{
  uint16_t c = a + b;
  if (c < a)  /* Can only happen due to overflow */
    c = -1;
  return c;
}
```
Note that this differs from the other algorithms in that it corrects overflow, instead of relying on another calculation to detect overflow.
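The compiler listings quoted in this answer refer to a function named `adds32`, which is not shown here. Presumably it is the 32-bit analogue of `add16`; a minimal sketch under that assumption:

```c
#include <stdint.h>

/* 32-bit variant of the add-then-correct pattern (body assumed, not from the answer). */
uint32_t adds32(uint32_t a, uint32_t b)
{
    uint32_t c = a + b;      /* wraps modulo 2^32 on overflow */
    if (c < a)               /* a wrapped sum is smaller than either input */
        c = UINT32_MAX;
    return c;
}
```

Unsigned wrap-around is well defined in C, so the `c < a` test is a portable way to detect the carry.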

x86-64 clang 3.7 -O3 output for adds32: significantly better than any other answer:
```
add     edi, esi
mov     eax, -1
cmovae  eax, edi
ret
```
ARMv7: gcc 4.8 -O3 -mcpu=cortex-a15 -fverbose-asm output for adds32:
```
adds    r0, r0, r1      @ c, a, b
it      cs
movcs   r0, #-1         @ conditional-move
bx      lr
```
16-bit: still doesn't use ARM's unsigned-saturating add instruction (UQADD16):
```
add     r1, r1, r0        @ tmp114, a
movw    r3, #65535      @ tmp116,
uxth    r1, r1  @ c, tmp114
cmp     r0, r1    @ a, c
ite     ls        @
movls   r0, r1        @,, c
movhi   r0, r3        @,, tmp116
bx      lr  @
```

既然无缘

2020-11-27 03:17

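```
int saturating_add(int x, int y)
{
    int w = sizeof(int) << 3;          /* width of int in bits, assuming 8-bit bytes */
    int msb = (int)(1u << (w-1));      /* sign-bit mask; shifted as unsigned, since 1 << (w-1) is undefined for int */

    /* Wrapping sum via unsigned arithmetic: signed x + y is undefined on overflow.
       Converting back to int is implementation-defined (two's complement on mainstream compilers). */
    int s = (int)((unsigned)x + (unsigned)y);
    int sign_x = msb & x;
    int sign_y = msb & y;
    int sign_s = msb & s;

    int nflow = sign_x && sign_y && !sign_s;   /* negative + negative gave non-negative */
    int pflow = !sign_x && !sign_y && sign_s;  /* non-negative + non-negative gave negative */

    int nmask = (~!nflow + 1);                 /* all ones unless nflow */
    int pmask = (~!pflow + 1);                 /* all ones unless pflow */

    return (nmask & ((pmask & s) | (~pmask & ~msb))) | (~nmask & msb);
}
```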

This implementation (for signed int) uses no control flow, no comparison operators (==, !=), and no ?: operator. It relies only on bitwise and logical operators.
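A branch-free version like this is easy to get subtly wrong, so it can help to have a plain reference to compare against. The sketch below is not part of the answer; `saturating_add_ref` is a hypothetical name, and it checks the limits before adding so that no signed overflow ever occurs.

```c
#include <limits.h>

/* Straightforward signed saturating add, intended as a test oracle. */
int saturating_add_ref(int x, int y)
{
    if (y > 0 && x > INT_MAX - y)   /* sum would exceed INT_MAX */
        return INT_MAX;
    if (y < 0 && x < INT_MIN - y)   /* sum would fall below INT_MIN */
        return INT_MIN;
    return x + y;                   /* safe: the result is in range */
}
```

Comparing the two functions over boundary values (`INT_MAX`, `INT_MIN`, 0, and -1 in every pairing) is a cheap sanity check.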


悲哀的现实

2020-11-27 03:20
I suppose the best way on x86 is to use inline assembly and check the carry flag after the addition (for unsigned operands, overflow shows up in CF, not OF). Something like:
```
add eax, ebx
jnc @@1
or eax, 0FFFFFFFFh
@@1:
.......
```
It's not very portable, but IMHO the most efficient way.
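For unsigned operands the condition of interest is the carry out of the addition. If GCC or Clang is an option, the `__builtin_add_overflow` extension exposes that condition without hand-written assembly. The sketch below assumes such a compiler (GCC 5 or later, or a recent Clang); `sadd32_builtin` is a hypothetical name, and the builtin is not standard C.

```c
#include <stdint.h>

/* Relies on a GCC/Clang extension; will not compile with compilers that lack it. */
uint32_t sadd32_builtin(uint32_t a, uint32_t b)
{
    uint32_t r;
    if (__builtin_add_overflow(a, b, &r))  /* true when the sum does not fit in r */
        r = UINT32_MAX;
    return r;
}
```

Whether this compiles down to an add followed by a conditional move or an sbb/or pair depends on the compiler and target, so it is worth checking the generated code.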