Why is 0 < -0x80000000?

Submitted by 放荡痞女 on 2019-11-26 18:08:48
Lundin

This is quite subtle.

Every integer literal in your program has a type. Which type it has is regulated by a table in 6.4.4.1:

Suffix      Decimal Constant    Octal or Hexadecimal Constant

none        int                 int
            long int            unsigned int
            long long int       long int
                                unsigned long int
                                long long int
                                unsigned long long int

If a literal number can't fit inside the default int type, it will attempt the next larger type as indicated in the above table. So for regular decimal integer literals it goes like:

  • Try int
  • If it can't fit, try long
  • If it can't fit, try long long.

Hex literals behave differently though! If the literal can't fit inside a signed type like int, it will first try unsigned int before moving on to trying larger types. See the difference in the above table.

So on a 32 bit system, your literal 0x80000000 is of type unsigned int.
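The table lookup can be checked directly with C11's _Generic. The following is a minimal sketch, assuming a C11 compiler and a 32-bit int; TYPE_NAME and the three function names are hypothetical helpers introduced here for illustration, not standard names.

```c
/* Map an expression to the name of its type using C11 _Generic.
   TYPE_NAME is a hypothetical helper macro, not part of the standard. */
#define TYPE_NAME(x) _Generic((x),            \
    int: "int",                               \
    unsigned int: "unsigned int",             \
    long: "long",                             \
    unsigned long: "unsigned long",           \
    long long: "long long",                   \
    unsigned long long: "unsigned long long", \
    default: "other")

/* Type chosen for the hexadecimal constant 0x80000000. */
const char *hex_literal_type(void) { return TYPE_NAME(0x80000000); }

/* Type chosen for the same value written as a decimal constant. */
const char *dec_literal_type(void) { return TYPE_NAME(2147483648); }

/* Unary minus keeps the type of its operand here. */
const char *negated_hex_literal_type(void) { return TYPE_NAME(-0x80000000); }
```

With a 32-bit int, the hexadecimal form and its negation both report unsigned int, while the decimal form skips unsigned int and reports the first signed type wide enough to hold the value.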

This means that you can apply the unary - operator to the literal without invoking undefined behavior, as you otherwise would when overflowing a signed integer. Instead, unsigned arithmetic wraps around and you get the value 0x80000000, a positive value.

bal < INT32_MIN invokes the usual arithmetic conversions, and the result of the expression -0x80000000 is converted from unsigned int to long long. The value 0x80000000 is preserved and 0 is less than 0x80000000, hence the result.

When you replace the literal with 2147483648L you use decimal notation and therefore the compiler doesn't pick unsigned int, but rather tries to fit it inside a long. Also the L suffix says that you want a long if possible. The L suffix actually has similar rules if you continue to read the mentioned table in 6.4.4.1: if the number doesn't fit inside the requested long, which it doesn't in the 32 bit case, the compiler will give you a long long where it will fit just fine.

0x80000000 is an unsigned literal with value 2147483648.

Applying the unary minus on this still gives you an unsigned type with a non-zero value. (In fact, for a non-zero value x, the value you end up with is UINT_MAX - x + 1.)
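The wrap-around formula can be sketched as two small functions. The function names are made up for this example, and the 0x80000000 case assumes a 32-bit unsigned int.

```c
#include <limits.h>

/* Negate an unsigned value; the result wraps modulo UINT_MAX + 1. */
unsigned int negate_unsigned(unsigned int x) { return -x; }

/* The same result computed with the formula from the text (for x != 0). */
unsigned int negate_by_formula(unsigned int x) { return UINT_MAX - x + 1u; }
```

For example, negate_unsigned(1u) yields UINT_MAX, and with a 32-bit unsigned int negate_unsigned(0x80000000u) yields 0x80000000u again, because that value is exactly half of UINT_MAX + 1.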

This integer literal 0x80000000 has type unsigned int.

According to the C Standard (6.4.4.1 Integer constants)

5 The type of an integer constant is the first of the corresponding list in which its value can be represented.

And this integer constant can be represented by the type of unsigned int.

So this expression

-0x80000000 has the same unsigned int type. Moreover, it has the same value, 0x80000000, which can be calculated the same way as a two's complement negation:

-0x80000000 = ~0x80000000 + 1 => 0x7FFFFFFF + 1 => 0x80000000

A related effect shows up with signed integers if you write, for example,

int x = INT_MIN;
x = abs( x );

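On typical two's complement implementations the result will again be INT_MIN (strictly speaking, abs( INT_MIN ) is undefined behavior, because the result cannot be represented in an int).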

Thus, in this condition

bal < INT32_MIN

0 is compared with the unsigned value 0x80000000, converted to type long long int according to the rules of the usual arithmetic conversions.

It is evident that 0 is less than 0x80000000.

The numeric constant 0x80000000 is of type unsigned int. If we take -0x80000000 and do two's complement math on it, we get this:

~0x80000000 = 0x7FFFFFFF
0x7FFFFFFF + 1 = 0x80000000

So -0x80000000 == 0x80000000. And comparing (0 < 0x80000000) (since 0x80000000 is unsigned) is true.

A point of confusion occurs in thinking the - is part of the numeric constant.

In the code below, 0x80000000 is the numeric constant. Its type is determined by that constant alone. The - is applied afterward and does not change the type.

#define INT32_MIN        (-0x80000000)
long long bal = 0;
if (bal < INT32_MIN )

Raw unadorned numeric constants are positive.

If it is decimal, then the type assigned is the first type that will hold it: int, long, long long.

If the constant is octal or hexadecimal, it gets the first type that holds it: int, unsigned, long, unsigned long, long long, unsigned long long.

On OP's system, 0x80000000 gets the type unsigned or unsigned long. Either way, it is some unsigned type.

-0x80000000 is also some non-zero value and being some unsigned type, it is greater than 0. When code compares that to a long long, the values are not changed on the 2 sides of the compare, so 0 < INT32_MIN is true.
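The comparison can be reproduced with a short sketch. It assumes a 32-bit int and a 64-bit long long; BAD_INT32_MIN is a hypothetical name that mirrors the problematic definition without clashing with the real INT32_MIN from <stdint.h>.

```c
/* Mirrors the problematic macro; the name is hypothetical. */
#define BAD_INT32_MIN (-0x80000000)

/* Returns 1 if bal compares less than the macro, 0 otherwise. */
int less_than_bad_min(long long bal) { return bal < BAD_INT32_MIN; }

/* The macro's value after conversion to long long. */
long long bad_min_value(void) { return BAD_INT32_MIN; }
```

Under those assumptions bad_min_value() is the positive number 2147483648, so less_than_bad_min(0) returns 1.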


An alternate definition avoids this curious behavior

#define INT32_MIN        (-2147483647 - 1)
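A minimal sketch of why this form behaves as expected, assuming int is at least 32 bits wide; GOOD_INT32_MIN and the function names are hypothetical, chosen to avoid clashing with <stdint.h>.

```c
/* 2147483647 fits in int, so the whole expression is computed in int
   and never passes through an unsigned type. The name is hypothetical. */
#define GOOD_INT32_MIN (-2147483647 - 1)

/* Returns 1 if bal compares less than the macro, 0 otherwise. */
int less_than_good_min(long long bal) { return bal < GOOD_INT32_MIN; }

/* The macro's value after conversion to long long. */
long long good_min_value(void) { return GOOD_INT32_MIN; }
```

Here good_min_value() is -2147483648 and less_than_good_min(0) returns 0. Writing -2147483648 directly would also give a negative value, but the constant 2147483648 does not fit in a 32-bit int, so that expression would have a wider type than int.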

Let us walk in fantasy land for a while where int and unsigned are 48-bit.

Then 0x80000000 fits in int and so has type int. -0x80000000 is then a negative number and the result of the comparison is different.

[Back to the real world]

Since 0x80000000 fits in some unsigned type before a signed type as it is just larger than some_signed_MAX yet within some_unsigned_MAX, it is some unsigned type.

C has a rule that an integer literal may be signed or unsigned depending on whether its value fits in a signed or an unsigned type (this is the literal typing rule of 6.4.4.1, not integer promotion). On a machine with a 32-bit int the literal 0x80000000 will be unsigned, and negating it modulo 2^32 gives 0x80000000 again. Therefore, the comparison bal < INT32_MIN is between a signed long long and an unsigned int, and before the comparison the unsigned int is converted to long long, as required by the usual arithmetic conversions.

C11: 6.3.1.8/1:

[...] Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.

Therefore, bal < INT32_MIN is true when bal is 0 (and for any bal below 2147483648).
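To see how the quoted rule differs from the more familiar int versus unsigned int case, here is a small sketch. It assumes a 32-bit int and a 64-bit long long, and the function names are invented for illustration.

```c
/* long long vs. unsigned int: long long can represent every unsigned int
   value (under the stated assumptions), so u is converted to long long
   and its value is preserved. */
int ll_less_than_uint(long long a, unsigned int u) { return a < u; }

/* int vs. unsigned int: both have the same rank, so a is converted to
   unsigned int and a negative value wraps to a large positive one. */
int int_less_than_uint(int a, unsigned int u) { return a < u; }
```

So ll_less_than_uint(-1, 0x80000000u) returns 1, a mathematically sensible answer, whereas int_less_than_uint(-1, 1u) returns 0 because -1 becomes UINT_MAX before the comparison.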
