How to portably find out min(INT_MAX, abs(INT_MIN))?

感情迁移 submitted on 2019-11-28 11:52:44

While INT_MIN is typically -2147483648 and INT_MAX is typically 2147483647, the standard does not guarantee these values. TL;DR: The value you're searching for is INT_MAX in a conforming implementation. But calculating min(INT_MAX, abs(INT_MIN)) directly isn't portable.


The possible values of INT_MIN and INT_MAX

INT_MIN and INT_MAX are defined by Annex E (Implementation limits) of the C standard (C++ inherits this):

The contents of the header <limits.h> are given below, in alphabetical order. The minimum magnitudes shown shall be replaced by implementation-defined magnitudes with the same sign. The values shall all be constant expressions suitable for use in #if preprocessing directives. The components are described further in 5.2.4.2.1.

[...]

#define INT_MAX +32767

#define INT_MIN -32767

[...]

The standard requires the type int to be an integer type that can represent the range [INT_MIN, INT_MAX] (section 5.2.4.2.1.).

Then 6.2.6.2. (Integer types, again part of the C standard) comes into play and further restricts this to what we know as sign and magnitude, two's complement, or ones' complement:

For signed integer types, the bits of the object representation shall be divided into three groups: value bits, padding bits, and the sign bit. There need not be any padding bits; signed char shall not have any padding bits. There shall be exactly one sign bit. Each bit that is a value bit shall have the same value as the same bit in the object representation of the corresponding unsigned type (if there are M value bits in the signed type and N in the unsigned type, then M ≤ N). If the sign bit is zero, it shall not affect the resulting value. If the sign bit is one, the value shall be modified in one of the following ways:

  • the corresponding value with sign bit 0 is negated (sign and magnitude);
  • the sign bit has the value −(2^M) (two’s complement);
  • the sign bit has the value −(2^M − 1) (ones’ complement).

Section 6.2.6.2. is also very important to relate the value representation of the signed integer types with the value representation of its unsigned siblings.

This means you get either the range [-(2^n - 1), (2^n - 1)] or the range [-2^n, (2^n - 1)], where n is the number of value bits, typically 15 or 31.
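
As a small illustration of the two possible shapes of the range, the following sketch classifies the current implementation at compile time. It relies on the Annex E guarantee quoted above that the limits are usable in #if directives. The function name int_range_is_symmetric is made up for this example; only INT_MIN, INT_MAX, and static_assert (C11, via <assert.h>) come from the standard.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* INT_MIN, INT_MAX */

/* Per 6.2.6.2, a conforming implementation never has -INT_MIN < INT_MAX. */
static_assert(INT_MIN + INT_MAX <= 0, "unexpected int range");

/* Hypothetical helper (not from the original answer).
   Returns 1 if the range is [-(2^n - 1), 2^n - 1],
   and 0 if it is [-2^n, 2^n - 1]. */
int int_range_is_symmetric(void)
{
#if INT_MIN + INT_MAX == 0
    return 1;
#else
    return 0;
#endif
}
```

On a typical two's complement platform this should return 0, since INT_MIN has one more value below zero than INT_MAX has above it.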

Operations on signed integer types

Now for the second thing: if an operation on a signed integer type results in a value that is not within the range [INT_MIN, INT_MAX], the behavior is undefined. This is explicitly mandated in C++ by Paragraph 5/4:

If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined.

For C, 6.5/5 offers a very similar passage:

If an exceptional condition occurs during the evaluation of an expression (that is, if the result is not mathematically defined or not in the range of representable values for its type), the behavior is undefined.

So what happens if the value of INT_MIN happens to be less than the negative of INT_MAX (e.g. -32768 and 32767 respectively)? Calculating -(INT_MIN) will be undefined, the same as INT_MAX + 1.
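
One way to avoid that undefined negation is to test the operand before negating it. The sketch below is only an illustration: checked_negate is a hypothetical name, and the check relies on -INT_MAX always being representable, which follows from the three representations listed in 6.2.6.2.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not from the original answer).
   Stores -x in *out and returns true when -x is representable as int;
   returns false and leaves *out untouched otherwise. */
bool checked_negate(int x, int *out)
{
    assert(out != NULL);
    /* -INT_MAX is always representable, so this comparison is safe.
       It is true only when x lies below -INT_MAX, i.e. x == INT_MIN
       on an implementation with the asymmetric range. */
    if (x < -INT_MAX) {
        return false;
    }
    *out = -x;
    return true;
}
```

With the asymmetric range, checked_negate(INT_MIN, &r) reports failure instead of evaluating -INT_MIN; with the symmetric range it succeeds and stores INT_MAX.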

So we need to avoid ever calculating a value that may not be in the range [INT_MIN, INT_MAX]. Luckily, INT_MAX + INT_MIN is always in that range, as INT_MAX is a strictly positive value and INT_MIN a strictly negative value. Hence INT_MIN < INT_MAX + INT_MIN < INT_MAX.

Now we can check whether INT_MAX + INT_MIN is equal to, less than, or greater than 0.

`INT_MAX + INT_MIN`  |  value of -INT_MIN    | value of -INT_MAX 
------------------------------------------------------------------
         < 0         |  undefined            | -INT_MAX
         = 0         |  INT_MAX = -INT_MIN   | -INT_MAX = INT_MIN
         > 0         |  cannot occur according to 6.2.6.2. of the C standard

Hence, to determine the minimum of INT_MAX and -INT_MIN (in the mathematical sense), the following code is sufficient:

if ( INT_MAX + INT_MIN == 0 )
{
    return INT_MAX; // or -INT_MIN, it doesn't matter
}
else if ( INT_MAX + INT_MIN < 0 )
{
    return INT_MAX; // INT_MAX is smaller, -INT_MIN cannot be represented.
}
else // ( INT_MAX + INT_MIN > 0 )
{
    return -INT_MIN; // -INT_MIN is actually smaller than INT_MAX, may not occur in a conforming implementation.
}

Or, to simplify:

return (INT_MAX + INT_MIN <= 0) ? INT_MAX : -INT_MIN;

Only the selected operand of the conditional (ternary) operator is evaluated. Hence, -INT_MIN is either left unevaluated (and therefore cannot produce UB), or is a well-defined value.
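
To see all three branches of that expression in action, the same logic can be written over arbitrary bounds. This is a sketch, not part of the original answer: min_magnitude_bound and its parameters lo and hi are hypothetical, and the function assumes lo < 0 < hi so that hi + lo cannot overflow.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical generalization of the expression above.
   Returns min(hi, |lo|) without ever computing an unrepresentable value.
   Assumes lo < 0 < hi. */
int min_magnitude_bound(int lo, int hi)
{
    assert(lo < 0 && hi > 0);
    /* hi + lo lies strictly between lo and hi, so it cannot overflow.
       -lo is evaluated only when hi + lo > 0, i.e. when -lo < hi. */
    return (hi + lo <= 0) ? hi : -lo;
}
```

Called as min_magnitude_bound(INT_MIN, INT_MAX), it yields INT_MAX on any conforming implementation; called with made-up bounds such as (-10, 20), it takes the -lo branch and yields 10.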

Or, if you want an assertion:

assert(INT_MAX + INT_MIN <= 0);
return INT_MAX;

Or, if you want that at compile time:

static_assert(INT_MAX + INT_MIN <= 0, "non-conforming implementation");
return INT_MAX;

Getting integer operations right (i.e. if correctness matters)

If you're interested in safe integer arithmetic, have a look at my implementation of safe integer operations. If you want to see the patterns (rather than this lengthy text output) on which operations fail and which succeed, choose this demo.

Depending on the architecture, there may be other options to ensure correctness, such as gcc's option -ftrapv.
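
For readers who only want the general pattern, here is a minimal portable sketch of a checked addition. It is not the implementation referred to above; checked_add is a hypothetical name, and the idea is simply to compare against the limits before adding, so that no out-of-range value is ever computed.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper illustrating a pre-checked signed addition.
   Stores a + b in *out and returns true if the sum is representable;
   returns false and leaves *out untouched otherwise. */
bool checked_add(int a, int b, int *out)
{
    assert(out != NULL);
    if (b > 0 && a > INT_MAX - b) {
        return false;   /* a + b would exceed INT_MAX */
    }
    if (b < 0 && a < INT_MIN - b) {
        return false;   /* a + b would fall below INT_MIN */
    }
    *out = a + b;
    return true;
}
```

INT_MAX - b (for b > 0) and INT_MIN - b (for b < 0) both stay inside the representable range, so the checks themselves are well defined.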

INT_MAX + INT_MIN < 0 ? INT_MAX : -INT_MIN

Edited to add explanation: Of course the difficulty is that -INT_MIN or abs(INT_MIN) will be undefined if -INT_MIN is too big to fit in an int. So we need some way of checking whether this is the case. The condition INT_MAX + INT_MIN < 0 tests whether -INT_MIN is greater than INT_MAX. If it is, then INT_MAX is the smaller of the two absolute values. If not, then INT_MAX is at least as large as -INT_MIN, so -INT_MIN is representable and is the correct answer.

Jeremy Roman

In C99 and above, INT_MAX.

Quoth the spec:

For signed integer types, the bits of the object representation shall be divided into three groups: value bits, padding bits, and the sign bit. There need not be any padding bits; signed char shall not have any padding bits. There shall be exactly one sign bit. Each bit that is a value bit shall have the same value as the same bit in the object representation of the corresponding unsigned type (if there are M value bits in the signed type and N in the unsigned type, then M ≤ N). If the sign bit is zero, it shall not affect the resulting value. If the sign bit is one, the value shall be modified in one of the following ways:

  • the corresponding value with sign bit 0 is negated (sign and magnitude);
  • the sign bit has the value −(2^M) (two’s complement);
  • the sign bit has the value −(2^M − 1) (ones’ complement).

(Section 6.2.6.2 of http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf)

-INT_MAX is representable as an int in all C and C++ dialects, as far as I know. Therefore:

-INT_MAX <= INT_MIN ? -INT_MIN : INT_MAX

On most systems, abs (INT_MIN) is not defined. For example, on typical 32 bit machines, INT_MAX = 2^31 - 1, INT_MIN = - 2^31, and abs (INT_MIN) cannot be 2^31.

abs(INT_MIN) will invoke undefined behavior when the result is not representable. The standard says:

7.22.6.1 The abs, labs and llabs functions:

The abs, labs, and llabs functions compute the absolute value of an integer j. If the result cannot be represented, the behavior is undefined.

Try this instead:
Convert INT_MIN to unsigned int. Since negative numbers can't be represented as an unsigned int, INT_MIN will be converted to UINT_MAX + 1 + INT_MIN.

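#include <limits.h>   /* INT_MIN, INT_MAX */
#include <stdio.h>
#include <stdlib.h>

unsigned min(unsigned a, unsigned b)
{
    return a < b ? a : b;
}

int main(void)
{
    /* Both arguments are converted to unsigned;
       INT_MIN becomes UINT_MAX + 1 + INT_MIN. */
    printf("%u\n", min(INT_MAX, INT_MIN));
    return 0;
}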
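
A variant of the same idea computes the actual magnitude in unsigned arithmetic, which wraps modulo UINT_MAX + 1 instead of overflowing. This is a sketch under one stated assumption: UINT_MAX is greater than INT_MAX, which holds on common platforms but is checked explicitly below. The names unsigned_abs and min_magnitude are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Assumption of this sketch: every int magnitude fits in unsigned. */
static_assert(UINT_MAX > INT_MAX, "unsigned cannot hold |INT_MIN|");

/* Hypothetical helper: |x| as an unsigned value, with no signed overflow.
   For negative x, (unsigned)x is UINT_MAX + 1 + x, and subtracting that
   from 0u wraps around to -x. */
unsigned unsigned_abs(int x)
{
    return x < 0 ? 0u - (unsigned)x : (unsigned)x;
}

/* Hypothetical helper: min(INT_MAX, |INT_MIN|) computed in unsigned. */
unsigned min_magnitude(void)
{
    unsigned a = unsigned_abs(INT_MIN);
    unsigned b = (unsigned)INT_MAX;
    return a < b ? a : b;
}
```

Unlike abs(INT_MIN), unsigned_abs(INT_MIN) is well defined under the stated assumption, and min_magnitude() returns INT_MAX as an unsigned value.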