detecting multiplication of uint64_t integers overflow with C

问题

Is there any efficient and portable way to check when multiplication operations with int64_t or uint64_t operands overflow in C?

For instance, for addition of uint64_t I can do:

if (UINT64_MAX - a < b) overflow_detected();
else sum = a + b;

But I can not get to a similar simple expression for multiplication.

All that occurs to me is breaking the operands into high and low uint32_t parts and performing the multiplication of those parts while checking for overflow, something really ugly and probably inefficient too.

Update 1: Some benchmark code implementing several approaches added

Update 2: Jens Gustedt method added

benchmarking program:

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

#define N 100000000

int d = 2;

#define POW_2_64 ((double)(1 << 31) * (double)(1 << 31) * 4)

#define calc_b (a + c)
// #define calc_b (a + d)

int main(int argc, char *argv[]) {
    uint64_t a;
    uint64_t c = 0;
    int o = 0;
    int opt;

    if (argc != 2) exit(1);

    opt = atoi(argv[1]);

    switch (opt) {

    case 1: /* faked check, just for timing */
        for (a = 0; a < N; a++) {
            uint64_t b = a + c;
            if (c > a) o++;
            c += b * a;
        }
        break;

    case 2: /* using division */
        for (a = 0; a < N; a++) {
            uint64_t b = a + c;
            if (b && (a > UINT64_MAX / b)) o++;
            c += b * a;
        }
        break;

    case 3: /* using floating point, unreliable */
        for (a = 0; a < N; a++) {
            uint64_t b = a + c;
            if ((double)UINT64_MAX < (double)a * (double)b) o++;
            c += b * a;
        }
        break;

    case 4: /* using floating point and division for difficult cases */
        for (a = 0; a < N; a++) {
            uint64_t b = a + c;
            double m = (double)a * (double)b;
            if ( ((double)(~(uint64_t)(0xffffffff)) < m ) &&
                 ( (POW_2_64 < m) ||
                   ( b &&
                     (a > UINT64_MAX / b) ) ) ) o++;
            c += b * a;
        }
        break;

    case 5: /* Jens Gustedt method */
        for (a = 0; a < N; a++) {
            uint64_t b = a + c;
            uint64_t a1, b1;
            if (a > b) { a1 = a; b1 = b; }
            else       { a1 = b; b1 = a; }
            if (b1 > 0xffffffff) o++;
            else {
                uint64_t a1l = (a1 & 0xffffffff) * b1;
                uint64_t a1h = (a1 >> 32) * b1 + (a1l >> 32);
                if (a1h >> 32) o++;
            }
            c += b1 * a1;
        }
        break;

    default:
        exit(2);
    }
    printf("c: %lu, o: %u\n", c, o);
}

So far, case 4 that uses floating point to filter most cases is the fastest when it is assumed that overflows are very unusual, at least on my computer where it is only two times slower than the do-nothing case.

Case 5, is 30% slower than 4, but it always performs the same, there isn't any special case numbers that require slower processing as happens with 4.

回答1:

If you want to avoid division as in Ambroz' answer:

First you have to see that the smaller of the two numbers, say a, is less than 2³², otherwise the result will overflow anyhow. Let b be decomposed into the two 32 bit words that is b = c 2³² + d.

The computation then is not so difficult, I find:

uint64_t mult_with_overflow_check(uint64_t a, uint64_t b) {
  if (a > b) return mult_with_overflow_check(b, a);
  if (a > UINT32_MAX) overflow();
  uint32_t c = b >> 32;
  uint32_t d = UINT32_MAX & b;
  uint64_t r = a * c;
  uint64_t s = a * d;
  if (r > UINT32_MAX) overflow();
  r <<= 32;
  return addition_with_overflow_check(s, r);
}

so this are two multiplications, two shifts, some additions and condition checks. This could be more efficient than the division because e.g the two multiplications can be pipelined in paralle. You'd have to benchmark to see what works better for you.

回答2:

Actually, the same principle can be used for multiplication:

uint64_t a;
uint64_t b;
...
if (b != 0 && a > UINT64_MAX / b) { // if you multiply by b, you get: a * b > UINT64_MAX
    < error >
}
uint64_t c = a * b;

For signed integers similar can be done, you'd probably need a case for each combination of signs.

回答3:

Related question with some (hopefully) useful answers: Best way to detect integer overflow in C/C++. Plus it not covers uint64_t only ;)

回答4:

case 6:
    for (a = 0; a < N; a++) {
        uint64_t b = a + c;
        uint64_t a1, b1;
        if (a > b) { a1 = a; b1 = b; }
        else       { a1 = b; b1 = a; }
        uint64_t cc = b1 * a1;
        c += cc;
        if (b1 > 0xffffffff) o++;
        else {
            uint64_t a1l = (a1 & 0xffffffff) + (a1 >> 32);
            a1l = (a1 + (a1 >> 32)) & 0xffffffff;
            uint64_t ab1l = a1l * b1;
            ab1l = (ab1l & 0xffffffff) + (ab1l >> 32);
            ab1l += (ab1l >> 32);
            uint64_t ccl = (cc & 0xffffffff) + (cc >> 32);
            ccl += (ccl >> 32);
            uint32_t ab32 = ab1l; if (ab32 == 0xffffffff) ab32 = 0;
            uint32_t cc32 = ccl; if (cc32 == 0xffffffff) cc32 = 0;
            if (ab32 != cc32) o++;
        }
    }
    break;

This method compares (possibly overflowing) result of normal multiplication with the result of multiplication, which cannot overflow. All calculations are modulo (2^32 - 1).

It is more complicated and (most likely) not faster than Jens Gustedt's method.

After some small modifications it may multiply with 96-bit precision (but without overflow control). What may be more interesting, the idea of this method may be used to check overflow for a series of arithmetic operations (multiplications, additions, subtractions).

Some questions answered

First of all, about "your code is not portable". Yes, code is not portable because it uses uint64_t, which is requested in the original question. Strictly speaking, you cannot get any portable answer with (u)int64_t because it is not required by standard.

About "once some overflow happens, you can not assume the result value to be anything". Standard says that unsigned itegers cannot overflow. Chapter 6.2.5, item 9:

A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.

So unsigned 64-bit multiplication is performed modulo 2^64, without overflow.

Now about the "logic behind". "Hash function" is not the correct word here. I only use calculations modulo (2^32 - 1). The result of multiplication may be represented as n*2^64 + m, where m is the visible result, and n means how much we overflow. Since 2^64 = 1 (mod 2^32 - 1), we may calculate [true value] - [visible value] = (n*2^64 + m) - m = n*2^64 = n (mod 2^32 - 1). If calculated value of n is not zero, there is an overflow. If it is zero, there is no overflow. Any collisions are possible only after n >= 2^32 - 1. This will never happen since we check that one of the multiplicands is less than 2^32.

回答5:

It might not detect exact overflows, but in general you can test the result of your multiplication on a logarithmic scale:

if (log(UINT64_MAX-1) - log(a) - log(b) < 0) overflow_detected(); // subtracting 1 to allow some tolerance when the numbers are converted to double
else prod = a * b;

It depends if you really need to do multiplication up to exact UINT64_MAX, otherwise this a very portable and convenient way to check multiplications of large numbers.

来源：https://stackoverflow.com/questions/8534107/detecting-multiplication-of-uint64-t-integers-overflow-with-c

标签

multiplication

integer-overflow