No useful and reliable way to detect integer overflow in C/C++?

Submitted by 邮差的信 on 2019-12-09 05:22:04

Question


No, this is not a duplicate of How to detect integer overflow?. The issue is the same but the question is different.


The gcc compiler can optimize away an overflow check (with -O2), for example:

int a, b;
b = abs(a);                     // will overflow if a = 0x80000000
if (b < 0) printf("overflow");  // optimized away

The gcc people argue that this is not a bug. Overflow is undefined behavior, according to the C standard, which allows the compiler to do anything. Apparently, anything includes assuming that overflow never happens. Unfortunately, this allows the compiler to optimize away the overflow check.

The safe way to check for overflow is described in a recent CERT paper. This paper recommends doing something like this before adding two integers:

if ( ((si1^si2) | (((si1^(~(si1^si2) & INT_MIN)) + si2)^si2)) >= 0) { 
  /* handle error condition */
} else {
  sum = si1 + si2;
}

Apparently, you have to do something like this before every +, -, *, / and other operations in a series of calculations when you want to be sure that the result is valid. For example if you want to make sure an array index is not out of bounds. This is so cumbersome that practically nobody is doing it. At least I have never seen a C/C++ program that does this systematically.
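For what it's worth, the per-operation test can at least be wrapped once in a helper so each + in a calculation calls it instead of repeating the expression. A minimal sketch, with a helper name of my own choosing and using the simpler range precondition rather than the bit-twiddling form quoted above (both test before the signed addition happens):

#include <climits>
#include <cstdio>

// Hypothetical helper (not from the CERT paper): report whether si1 + si2
// fits in an int, and perform the addition only when it is known to be safe.
static bool checked_add(int si1, int si2, int *sum) {
    if ((si2 > 0 && si1 > INT_MAX - si2) ||
        (si2 < 0 && si1 < INT_MIN - si2)) {
        return false;               // the addition would overflow
    }
    *sum = si1 + si2;               // in range, so no undefined behaviour
    return true;
}

int main() {
    int sum;
    if (!checked_add(INT_MAX, 1, &sum))
        std::printf("overflow\n");
}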

Now, this is a fundamental problem:

  • Checking an array index before accessing the array is useful, but not reliable.

  • Checking every operation in the series of calculations with the CERT method is reliable but not useful.

  • Conclusion: There is no useful and reliable way of checking for overflow in C/C++!

I refuse to believe that this was intended when the standard was written.

I know that there are certain command line options that can fix the problem, but this doesn't alter the fact that we have a fundamental problem with the standard or the current interpretation of it.

Now my question is: Are the gcc people taking the interpretation of "undefined behavior" too far when it allows them to optimize away an overflow check, or is the C/C++ standard broken?

Added note: Sorry, you may have misunderstood my question. I am not asking how to work around the problem - that has already been answered elsewhere. I am asking a more fundamental question about the C standard. If there is no useful and reliable way of checking for overflow then the language itself is dubious. For example, if I make a safe array class with bounds checking then I should be safe, but I'm not if the bounds checking can be optimized away.

If the standard allows this to happen then either the standard needs revision or the interpretation of the standard needs revision.
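To make the safe-array concern concrete, here is a hypothetical sketch (not taken from any real class) of how a bounds check can be undermined when the index arithmetic itself overflows:

// Hypothetical sketch: the bounds check operates on a value that may already
// be the result of signed overflow. Because that overflow is UB, a compiler
// that can prove base >= 0 and offset >= 0 may conclude that index can never
// be negative and drop the first half of the check.
int element_at(const int *array, int base, int offset, int size) {
    int index = base + offset;              // may overflow: undefined behaviour
    if (index < 0 || index >= size)
        return -1;                          // intended out-of-bounds handling
    return array[index];
}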

Added note 2: People here seem unwilling to discuss the dubious concept of "undefined behavior". The fact that the C99 standard lists 191 different kinds of undefined behavior (link) is an indication of a sloppy standard.

Many programmers readily accept the statement that "undefined behavior" gives the license to do anything, including formatting your hard disk. I think it is a problem that the standard puts integer overflow into the same dangerous category as writing outside array bounds.

Why are these two kinds of "undefined behavior" different? Because:

  • Many programs rely on integer overflow being benign, but few programs rely on writing outside array bounds when you don't know what is there.

  • Writing outside array bounds actually can do something as bad as formatting your hard disk (at least in an unprotected OS like DOS), and most programmers know that this is dangerous.

  • When you put integer overflow into the dangerous "anything goes" category, it allows the compiler to do anything, including lying about what it is doing (in the case where an overflow check is optimized away).

  • An error such as writing outside array bounds can be found with a debugger, but the error of optimizing away an overflow check cannot, because optimization is usually off when debugging.

  • The gcc compiler evidently refrains from the "anything goes" policy in case of integer overflow. There are many cases where it refrains from optimizing e.g. a loop unless it can verify that overflow is impossible. For some reason, the gcc people have recognized that we would have too many errors if they followed the "anything goes" policy here, but they have a different attitude to the problem of optimizing away an overflow check.

Maybe this is not the right place to discuss such philosophical questions. At least, most answers here are off the point. Is there a better place to discuss this?


Answer 1:


Ask yourself: how often do you actually need checked arithmetic? If you need it often, you should write a checked_int class that overloads the common operators and encapsulates the checks in that class. Props for sharing the implementation on an Open Source website.
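A minimal sketch of what such a class might look like, showing only operator+ and assuming (my choice, not the answer's) that overflow should throw:

#include <climits>
#include <stdexcept>

// Sketch of the suggested checked_int idea; only addition is shown, and the
// decision to throw std::overflow_error is an assumption for illustration.
class checked_int {
    int value_;
public:
    explicit checked_int(int v) : value_(v) {}
    int value() const { return value_; }

    checked_int operator+(checked_int rhs) const {
        // Test before the signed addition so the check cannot be optimized away.
        if ((rhs.value_ > 0 && value_ > INT_MAX - rhs.value_) ||
            (rhs.value_ < 0 && value_ < INT_MIN - rhs.value_))
            throw std::overflow_error("checked_int: addition overflows");
        return checked_int(value_ + rhs.value_);
    }
};

// Usage: checked_int(INT_MAX) + checked_int(1) throws instead of invoking UB.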

Better yet (arguably), use a big_integer class so that overflows can’t happen in the first place.




Answer 2:


The gcc developers are entirely correct here. When the standard says the behavior is undefined, it means exactly that: there are no requirements on the compiler.

Since a valid program cannot do anything that causes UB (it would no longer be valid if it did), the compiler is entitled to assume that UB never happens. And if it happens anyway, anything the compiler does is acceptable.

For your problem with overflow, one solution is to consider what ranges the calculations are supposed to handle. For example, when balancing my bank account I can assume that the amounts will be well below 1 billion, so a 32-bit int will work.

For your application domain you can probably do similar estimates about exactly where an overflow could be possible. Then you can add checks at those points or choose another data type, if available.
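As a minimal sketch of that approach (the limit, the unit and the names below are assumptions for illustration, not taken from the answer):

#include <cstdint>

// Sketch: the domain guarantees amounts stay well below 1 billion units, so a
// 64-bit type plus a single domain check stands in for per-operation checks.
const std::int64_t kMaxAmount = 1000000000LL;    // assumed domain limit

bool deposit(std::int64_t *balance, std::int64_t amount) {
    if (amount < 0 || amount > kMaxAmount || *balance > kMaxAmount)
        return false;                            // outside the assumed domain
    *balance += amount;                          // at most 2e9, far below INT64_MAX
    return true;
}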




Answer 3:


int a, b;
b = abs(a); // will overflow if a = 0x80000000
if (b < 0) printf("overflow");  // optimized away 

(You seem to be assuming two's complement... let's run with that.)

Who says abs(a) "overflows" if a has that binary pattern (more accurately, if a is INT_MIN)? The Linux man page for abs(int) says:

Trying to take the absolute value of the most negative integer is not defined.

Not defined doesn't necessarily mean overflow.

So your premise that b could ever be less than 0, and that this is somehow a test for "overflow", is fundamentally flawed from the start. If you want to test, you cannot do it on a result that may have undefined behaviour - do it before the operation instead!
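A sketch of what testing before the operation looks like for this particular case (assuming you simply want to report and skip the one problematic value):

#include <climits>
#include <cstdio>
#include <cstdlib>

int main() {
    int a = INT_MIN;                 // the one value abs() cannot represent
    if (a == INT_MIN) {
        // The test happens before abs(), so no UB has occurred and the
        // compiler cannot assume the branch away.
        std::printf("overflow\n");
    } else {
        std::printf("%d\n", std::abs(a));
    }
}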

If you care about this, you can use C++'s user-defined types (i.e. classes) to implement your own set of tests around the operations you need (or find a library that already does that). The language does not need built-in support for this, as it can be implemented just as efficiently in such a library, with the resulting semantics of use unchanged. That fundamental power is one of the great things about C++.




Answer 4:


Just use the correct type for b:

int a;
unsigned b = a;
if (b == (unsigned)INT_MIN) printf("overflow");  // never optimized away
else b = abs(a);

Edit: A test for overflow in C can be done safely with unsigned types. Unsigned types simply wrap around on arithmetic, and signed values convert to them in a well-defined way, so you can do any test on them that you like. On modern processors this conversion is usually just a reinterpretation of a register, so it comes at no runtime cost.
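A sketch of the unsigned-wrap technique applied to addition (this is my own illustration of the suggestion, not the answerer's code, and it assumes two's complement like the rest of the thread):

#include <climits>
#include <cstdio>

// Do the arithmetic in unsigned, where wrap-around is well defined, then
// decide from the sign bits whether the signed addition would have
// overflowed: overflow happens exactly when both operands share a sign bit
// and the result has the opposite one.
bool add_overflows(int si1, int si2) {
    unsigned u1 = (unsigned)si1, u2 = (unsigned)si2;  // well-defined conversions
    unsigned usum = u1 + u2;                          // wraps, never UB
    unsigned sign = (unsigned)INT_MIN;                // the sign bit
    return ((u1 ^ u2) & sign) == 0 && ((usum ^ u1) & sign) != 0;
}

int main() {
    std::printf("%d %d\n", add_overflows(INT_MAX, 1), add_overflows(1, 2));  // prints: 1 0
}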



Source: https://stackoverflow.com/questions/6856227/no-useful-and-reliable-way-to-detect-integer-overflow-in-c-c
