Get GCC To Use Carry Logic For Arbitrary Precision Arithmetic Without Inline Assembly?

When working with arbitrary precision arithmetic (e.g. 512-bit integers), is there any way to get GCC to use ADC and similar instructions without using inline assembly?

A first glance at GMP's sourcecode shows that they simply have assembly implementations for every supported platform.

Here is the test code I wrote, which adds two 128-bit numbers from the command line and prints the result. (Inspired by mini-gmp's add_n):

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

int main (int argc, char **argv)
{
    uint32_t a[4];
    uint32_t b[4];
    uint32_t c[4];
    uint32_t carry = 0;

    for (int i = 0; i < 4; ++i)
    {
        a[i] = strtoul (argv[i+1], NULL, 16);
        b[i] = strtoul (argv[i+5], NULL, 16);
    }

    for (int i = 0; i < 4; ++i)
    {
        uint32_t aa = a[i];
        uint32_t bb = b[i];
        uint32_t r = aa + carry;
        carry = (r < carry);
        r += bb;
        carry += (r < bb);
        c[i] = r;
    }

    printf ("%08X%08X%08X%08X + %08X%08X%08X%08X =\n", a[3], a[2], a[1], a[0], b[3], b[2], b[1], b[0]);
    printf ("%08X%08X%08X%08X\n", c[3], c[2], c[1], c[0]);

    return 0;
}

GCC -O3 -std=c99 Does not produce any adc instructions, as checked by objdump. My gcc version is i686-pc-mingw32-gcc (GCC) 4.5.2.

GCC will use the carry flag if it can see that it needs to:
When adding two uint64_t values on a 32-bit machine, for example, this must result in one 32-bit ADD plus one 32-bit ADC. But apart from those cases, where the compiler is forced to use the carry, it probably cannot be persuaded to do so w/o assembler. Therefore, it may be beneficial to use the biggest integer type available to allow GCC to optimize operations by effectively letting it know that the single 'components' of the value belong together.

For the simple addition, another way to calculate the carry could be to look at the relevant bits in the operands, like:

uint32_t aa,bb,rr;
bool msbA, msbB, msbR, carry;
// ...

rr = aa+bb;

msbA = aa >= (1<<31); // equivalent: (aa & (1<<31)) != 0;
msbB = bb >= (1<<31);
msbR = rr >= (1<<31);


carry = (msbA && msbB) || ( !msbR && ( msbA || msbB) );

来源：https://stackoverflow.com/questions/15696540/get-gcc-to-use-carry-logic-for-arbitrary-precision-arithmetic-without-inline-ass

标签

optimization

gcc

compiler-optimization

arbitrary-precision