signed int x = -5;
unsigned int y = x;
What is the value of y? How is this so?
The value of y is UINT_MAX - 5 + 1, i.e. UINT_MAX - 4.
When you convert a signed integer value to an unsigned type, the value is reduced modulo 2^N, where N is the number of value bits in the unsigned type. This applies to both negative and positive signed values.
If you are converting from a signed type to an unsigned type of the same size, the above means that non-negative signed values remain unchanged (+5 gets converted to 5, for example), while negative values have MAX + 1 added to them, where MAX is the maximum value of the unsigned type (-5 gets converted to MAX + 1 - 5).
From the C99 standard:
6.3.1.3 Signed and unsigned integers
- When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
- Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type. 49)
49) The rules describe arithmetic on the mathematical value, not the value of a given type of expression.
So you'll be looking at, effectively, y = x + UINT_MAX + 1.
This just happens to mean that the two's complement representation is used unchanged as an unsigned integer, which makes the conversion very cheap on most modern computers, as they use two's complement for signed integers.
Signed values are typically stored as something called two's complement:
Two's complement numbers are a way to encode negative numbers into ordinary binary, such that addition still works. Adding -1 + 1 should equal 0, but ordinary addition gives the result of 2 or -2 unless the operation takes special notice of the sign bit and performs a subtraction instead. Two's complement results in the correct sum without this extra step.
This means that the in-memory representations of the numbers -5 and 4294967291 (for a 32-bit word) are identical: 0xFFFFFFFB or 0b11111111111111111111111111111011. So when you do:
unsigned int y = x;
The contents of x are copied verbatim, i.e. bitwise, to y. This means that if you inspect the raw values in memory of x and y they will be identical. However, if you do:
unsigned long long y1 = x;
the value of x will be sign-extended before being converted to an unsigned long long. In the common case where long long is 64 bits, this means that y1 equals 0xFFFFFFFFFFFFFFFB.
It's important to note what happens when casting to a larger type. A signed value that is cast to a larger signed type will be sign-extended. This will not happen if the source value is unsigned, e.g.:
unsigned int z = y + 5;
long long z1 = (long long)x + 5; // sign extended since x is signed
long long z2 = (long long)y + 5; // not sign extended since y is unsigned
z and z1 will equal 0, but z2 will not: with a 32-bit unsigned int and a 64-bit long long it equals 4294967296. This can be remedied by casting the value to signed before expanding it:
long long z3 = (long long)(signed int)y + 5;
or, analogously, if you don't want the sign extension to occur:
long long z4 = (long long)(unsigned int)x;
y = 0xfffffffb, which is the two's complement representation of -5 in a 32-bit int.
It depends on the maximum value of unsigned int. Typically, an unsigned int is 32 bits long, so UINT_MAX is 2^32 − 1. The C standard (§6.3.1.3/2) requires that a signed → unsigned conversion be performed as follows:
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
Thus y = x + ((2^32 − 1) + 1) = 2^32 − 5 = 4294967291.
On a two's complement platform, which most implementations are nowadays, y also has the same bit pattern as the two's complement representation of x.
-5 = ~5 + 1 = 0xFFFFFFFA + 1 = 0xFFFFFFFB = 4294967291.