How to avoid overflow in the expression A * B - C * D

I need to compute an expression which looks like: A*B - C*D, where their types are: signed long long int A, B, C, D; Each number can be really big (not

15 answers

甜味超标

2021-01-29 18:40
If the result fits in a long long int, the expression A*B-C*D gives the correct result on implementations where the arithmetic wraps around mod 2^64. The problem is knowing whether the result fits in a long long int. To detect this, you can use the following trick with doubles:
```
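/* needs <math.h> for fabs and <limits.h> for LLONG_MAX */
if (fabs((double)A * B - (double)C * D) > (double)LLONG_MAX) {
    /* overflow: the result does not fit in a long long */
} else {
    return A * B - C * D;
}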
```
The problem with this approach is that you are limited by the precision of the mantissa of a double (53 bits), so you need to limit the products A*B and C*D to about 63+53 bits (or probably a little less).
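A minimal sketch of that check in C, with the precision caveat made explicit. The function name and the two thresholds (2^110 for the products, 2^62 for the estimate) are assumptions chosen for illustration, not part of the answer. The check is deliberately conservative: it may report failure for some results that would in fact fit.

```c
#include <stdbool.h>

/* Hypothetical helper, not from the answer above.
   Returns true and stores A*B - C*D in *out only when the double-precision
   estimate shows that the exact result fits in a long long. */
bool checked_diff_of_products(long long a, long long b,
                              long long c, long long d, long long *out)
{
    /* Assumed safety margins: for products beyond about 2^110 the rounding
       error of the estimate can approach 2^62, so the check is treated as
       inconclusive there. */
    const double product_limit = 0x1p110;
    const double result_limit = 0x1p62;

    double ab = (double)a * (double)b;
    double cd = (double)c * (double)d;
    if (ab > product_limit || ab < -product_limit ||
        cd > product_limit || cd < -product_limit)
        return false;

    /* Compare against 2^62 rather than LLONG_MAX to absorb rounding error. */
    double estimate = ab - cd;
    if (estimate >= result_limit || estimate <= -result_limit)
        return false;

    /* Unsigned arithmetic wraps mod 2^64. Converting back to long long is
       implementation-defined in older C standards, but wraps on common
       platforms. */
    *out = (long long)((unsigned long long)a * (unsigned long long)b
                     - (unsigned long long)c * (unsigned long long)d);
    return true;
}
```

For example, with A = 4000000000, B = 4000000001, C = 4000000002, D = 3999999999, both products exceed the range of a long long, yet the function reports success and stores 2.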
南笙

2021-01-29 18:45
Note that this is not standard, since it relies on wrap-around on signed overflow, which is undefined behavior in C and C++. (GCC has compiler flags that enable wrap-around, such as -fwrapv.)

But if you just do all the calculations in long long with wrap-around in effect, the result of applying the formula directly, (A * B - C * D), will be accurate as long as the correct result fits into a long long.

Here's a work-around that relies only on the implementation-defined behavior of casting an unsigned integer to a signed integer. This can be expected to work on almost every system today.
```
(long long)((unsigned long long)A * B - (unsigned long long)C * D)
```
This casts the inputs to unsigned long long where the overflow behavior is guaranteed to be wrap-around by the standard. Casting back to a signed integer at the end is the implementation-defined part, but will work on nearly all environments today.
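As a small illustration, the one-liner can be wrapped in a function. The function name is made up for this sketch, and it assumes a platform where converting an out-of-range unsigned value to long long wraps around.

```c
/* Hypothetical wrapper around the cast-based expression above. */
long long diff_of_products_wrapping(long long a, long long b,
                                    long long c, long long d)
{
    /* Unsigned multiplication and subtraction wrap mod 2^64 by definition. */
    unsigned long long ab = (unsigned long long)a * (unsigned long long)b;
    unsigned long long cd = (unsigned long long)c * (unsigned long long)d;

    /* Implementation-defined conversion in older standards; assumed to wrap. */
    return (long long)(ab - cd);
}
```

With A = 4000000000, B = 4000000001, C = 4000000002, D = 3999999999, each product is about 1.6e19 and does not fit in a long long, but the difference is 2 and the function returns 2.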

If you need a more pedantic solution, I think you have to use "long arithmetic" (arbitrary-precision arithmetic).
梦如初夏

2021-01-29 18:49
You could write each number in an array, each element being a digit, and do the calculations as polynomials. Take the resulting polynomial, which is an array, and compute the result by multiplying each element of the array by 10 to the power of its position in the array (the first position having the largest exponent and the last having exponent zero).

The number 123 can be expressed as:
```
123 = 100 * 1 + 10 * 2 + 3
```
for which you just create an array [1 2 3].

You do this for all numbers A, B, C and D, and then you multiply them as polynomials. Once you have the resulting polynomial, you just reconstruct the number from it.
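A rough sketch of the multiplication step for non-negative numbers, assuming decimal digits stored most significant first as in the [1 2 3] example. The function name and the calling convention are invented for illustration. Signs and the final subtraction are left out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: schoolbook (polynomial) product of two digit arrays.
   a has na digits, b has nb digits, most significant digit first.
   out must have room for na + nb digits and may get leading zeros. */
void digits_multiply(const int *a, size_t na,
                     const int *b, size_t nb, int *out)
{
    memset(out, 0, (na + nb) * sizeof out[0]);

    /* Walk both numbers from the least significant digit upward. */
    for (size_t i = na; i-- > 0;) {
        int carry = 0;
        for (size_t j = nb; j-- > 0;) {
            /* Digit i of a times digit j of b lands at position i + j + 1. */
            int cur = out[i + j + 1] + a[i] * b[j] + carry;
            out[i + j + 1] = cur % 10;
            carry = cur / 10;
        }
        /* Leftover carry of this row goes one position higher. */
        out[i] += carry;
    }
}
```

Multiplying [1 2 3] by [4 5] this way fills a five-element array with [0 5 5 3 5], which is 5535.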
离开以前

2021-01-29 18:51
This seems too trivial I guess. But A*B is the one that could overflow.

You could do the following, without losing precision
```
A*B - C*D = A(D+E) - (A+F)D
          = AD + AE - AD - DF
          = AE - DF
             ^smaller quantities E & F

E = B - D (hence, far smaller than B)
F = C - A (hence, far smaller than C)
```
This decomposition can be done further.
As @Gian pointed out, care might need to be taken during the subtraction operation if the type is unsigned long long.

For example, with the case you have in the question, it takes just one iteration,
```
 MAX * MAX - (MAX - 1) * (MAX + 1)
  A     B       C           D

E = B - D = -1
F = C - A = -1

AE - DF = {MAX * -1} - {(MAX + 1) * -1} = -MAX + MAX + 1 = 1
```
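One step of this decomposition might look like the following in C. The function name is hypothetical, and the sketch assumes that B - D, C - A and the two smaller products A*E and D*F all fit in a long long; it does not check for that.

```c
/* Hypothetical helper: a single step of the rewrite
   A*B - C*D = A*E - D*F, with E = B - D and F = C - A. */
long long diff_of_products_by_differences(long long a, long long b,
                                          long long c, long long d)
{
    long long e = b - d; /* assumed small compared with B */
    long long f = c - a; /* assumed small compared with C */
    return a * e - d * f;
}
```

With A = B = 4000000000, C = 3999999999 and D = 4000000001, both E and F are -1, so the function computes -4000000000 + 4000000001 = 1 without ever forming the oversized products.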

醉酒成梦

2021-01-29 18:51

Choose K = a big number (e.g., K = A - sqrt(A))

A*B - C*D = (A-K)*(B-K) - (C-K)*(D-K) + K*(A-C+B-D); // Avoid overflow.

Why?

(A-K)*(B-K) = A*B - K*(A+B) + K^2
(C-K)*(D-K) = C*D - K*(C+D) + K^2

=>
(A-K)*(B-K) - (C-K)*(D-K) = A*B - K*(A+B) + K^2 - {C*D - K*(C+D) + K^2}
(A-K)*(B-K) - (C-K)*(D-K) = A*B - C*D - K*(A+B) + K*(C+D) + K^2 - K^2
(A-K)*(B-K) - (C-K)*(D-K) = A*B - C*D - K*(A+B-C-D)

=>
A*B - C*D = (A-K)*(B-K) - (C-K)*(D-K) + K*(A+B-C-D)

=>
A*B - C*D = (A-K)*(B-K) - (C-K)*(D-K) + K*(A-C+B-D)

Note that when A, B, C and D are all big numbers of similar magnitude, A-C and B-D are small numbers.
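The identity can be tried out with a small C function. The name and the extra parameter k are assumptions for this sketch; the caller picks K, and nothing here verifies that the shifted products stay in range.

```c
/* Hypothetical helper evaluating
   (A-K)*(B-K) - (C-K)*(D-K) + K*(A-C+B-D).
   Assumes K is close enough to the inputs that no intermediate overflows. */
long long diff_of_products_shifted(long long a, long long b,
                                   long long c, long long d, long long k)
{
    long long shifted = (a - k) * (b - k) - (c - k) * (d - k);
    /* Group as (A-C) + (B-D) so that the sum itself stays small. */
    long long correction = k * ((a - c) + (b - d));
    return shifted + correction;
}
```

For A = B = 4000000000, C = 3999999999, D = 4000000001 and K = 3999999000, the shifted part is 1000*1000 - 999*1001 = 1 and the correction term is 0, so the result is 1.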


走了就别回头了

2021-01-29 18:52
While a signed long long int will not hold A*B, two of them will. So A*B can be decomposed into three terms of different exponents, each of them fitting in one signed long long int.
```
A1=A>>32;
A0=A & 0xffffffff;
B1=B>>32;
B0=B & 0xffffffff;

AB_0=A0*B0;
AB_1=A0*B1+A1*B0;
AB_2=A1*B1;
```
Same for C*D.
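For comparison, here is a sketch of the same splitting idea for unsigned 64-bit operands, where the partial products are recombined into a high and a low 64-bit word. The type and function names are invented for this example, and handling of negative inputs is left out.

```c
#include <stdint.h>

/* Hypothetical 128-bit result stored as two 64-bit words. */
typedef struct { uint64_t hi, lo; } u128_parts;

/* Full 64x64 -> 128-bit product built from 32-bit halves. */
u128_parts mul_u64(uint64_t a, uint64_t b)
{
    uint64_t a0 = a & 0xffffffffu, a1 = a >> 32;
    uint64_t b0 = b & 0xffffffffu, b1 = b >> 32;

    uint64_t p00 = a0 * b0; /* weight 2^0  */
    uint64_t p01 = a0 * b1; /* weight 2^32 */
    uint64_t p10 = a1 * b0; /* weight 2^32 */
    uint64_t p11 = a1 * b1; /* weight 2^64 */

    /* Sum everything that lands in bits 32..63; cannot overflow 64 bits. */
    uint64_t mid = (p00 >> 32) + (p01 & 0xffffffffu) + (p10 & 0xffffffffu);

    u128_parts r;
    r.lo = (mid << 32) | (p00 & 0xffffffffu);
    r.hi = p11 + (p01 >> 32) + (p10 >> 32) + (mid >> 32);
    return r;
}
```

Each 32x32 partial product fits in 64 bits, which is why the halves are multiplied separately before being combined.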

Following the straightforward way, the subtraction could be done on every pair of AB_i and CD_i likewise, using an additional carry bit (more precisely, a 1-bit integer) for each. So if we say E=A*B-C*D you get something like:
```
E_00=AB_0-CD_0 
E_01=(AB_0 > CD_0) == (AB_0 - CD_0 < 0) ? 0 : 1  // carry bit if overflow
E_10=AB_1-CD_1 
...
```
We continue by transferring the upper-half of E_10 to E_20 (shift by 32 and add, then erase upper half of E_10).

Now you can get rid of the carry bit E_11 by adding it with the right sign (obtained from the non-carry part) to E_20. If this triggers an overflow, the result wouldn't fit either.

E_10 now has enough 'space' to take the upper half from E_00 (shift, add, erase) and the carry bit E_01.

E_10 may be larger now again, so we repeat the transfer to E_20.

At this point, E_20 must become zero, otherwise the result won't fit. The upper half of E_10 is empty as result of the transfer too.

The final step is to transfer the lower half of E_20 into E_10 again.

If the expectation that E=A*B-C*D would fit the signed long long int holds, we now have
```
E_20=0
E_10=0
E_00=E
```
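The subtraction and the final "does it fit" test can also be sketched on a simpler two-word representation, which is a different layout from the three-term scheme described above. The type and function names are hypothetical, and the difference is interpreted as a 128-bit two's-complement value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 128-bit value stored as two 64-bit words. */
typedef struct { uint64_t hi, lo; } wide_u128;

/* x - y with a borrow from the low word into the high word. */
wide_u128 wide_sub(wide_u128 x, wide_u128 y)
{
    wide_u128 r;
    uint64_t borrow = (x.lo < y.lo) ? 1u : 0u;
    r.lo = x.lo - y.lo;
    r.hi = x.hi - y.hi - borrow;
    return r;
}

/* Returns true and stores the value if it fits in a signed 64-bit integer. */
bool wide_fits_int64(wide_u128 v, int64_t *out)
{
    /* Non-negative: high word is zero and the sign bit of lo is clear. */
    bool non_negative = (v.hi == 0 && v.lo <= (uint64_t)INT64_MAX);
    /* Negative: high word is all ones and the sign bit of lo is set. */
    bool negative = (v.hi == UINT64_MAX && v.lo > (uint64_t)INT64_MAX);
    if (!non_negative && !negative)
        return false;

    /* Implementation-defined conversion in older standards; assumed to wrap. */
    *out = (int64_t)v.lo;
    return true;
}
```

Subtracting {hi = 0, lo = 7} from {hi = 0, lo = 5} yields a high word of all ones, and the fit check then reports -2.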