I have the following code:
unsigned char x = 255;
printf("%x\n", x); // ff
unsigned char tmp = x << 7;
unsign
The 'intermediate' values in your last case are (full) integers, so the bits that are shifted 'out of range' of the original unsigned char
type are retained, and thus they are still set when the result is converted back to a single byte.
From this C11 Draft Standard:
6.5.7 Bitwise shift operators
...
3 The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand ...
However, in your first case (unsigned char tmp = x << 7;), tmp loses the seven 'high' bits when the resultant 'full' integer (0x7f80) is converted (i.e. truncated) back to a single byte, giving a value of 0x80; when this is then right-shifted in unsigned char y = tmp >> 7; the result is (as expected) 0x01.
The shift operators never operate on the char types directly. The value of any char operand is converted to int, and the result of the expression is converted back to the char type when it is assigned.
So, when you put the left and right shift operators in the same expression, the calculation is performed as type int (without losing any bits), and only the final result is converted to char.
This little test is actually more subtle than it looks as the behavior is implementation defined:
unsigned char x = 255;
No ambiguity here: x is an unsigned char with value 255, and type unsigned char is guaranteed to have enough range to store 255.
printf("%x\n", x);
This produces ff on standard output, but it would be cleaner to write printf("%hhx\n", x); because printf expects an unsigned int for the %x conversion, which x is not. After the default argument promotions, x is actually passed as an int or an unsigned int, depending on the platform.
unsigned char tmp = x << 7;
To evaluate the expression x << 7, x, being an unsigned char, first undergoes the integer promotions defined in the C Standard 6.3.1.1: If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions.
So if the number of value bits in unsigned char is less than or equal to that of int (the most common case currently being 8 vs 31), x is first promoted to an int with the same value, which is then shifted left by 7 positions. The result, 0x7f80, is guaranteed to fit in the int type, so the behavior is well defined, and converting this value to type unsigned char effectively truncates the high order bits of the value. If type unsigned char has 8 bits, the value will be 128 (0x80), but if type unsigned char has more bits, the value in tmp can be 0x180, 0x380, 0x780, 0xf80, 0x1f80, 0x3f80 or even 0x7f80.
If type unsigned char has more value bits than int, which can occur on rare systems where sizeof(int) == 1, x is promoted to unsigned int and the left shift is performed on this type. The value is 0x7f80U, which is guaranteed to fit in type unsigned int, and storing it to tmp does not actually lose any information since type unsigned char has the same size as unsigned int. So tmp would have the value 0x7f80 in this case.
unsigned char y = tmp >> 7;
The evaluation proceeds as above: tmp is promoted to int or unsigned int depending on the system, which preserves its value, and this value is shifted right by 7 positions, which is fully defined because 7 is less than the width of the type (int or unsigned int) and the value is non-negative. Depending on the number of bits of type unsigned char, the value stored in y can be 1, 3, 7, 15, 31, 63, 127 or 255; the most common architecture will have y == 1.
printf("%x\n", y);
Again, it would be better to write printf("%hhx\n", y); and the output may be 1 (most common case) or 3, 7, f, 1f, 3f, 7f or ff depending on the number of value bits in type unsigned char.
unsigned char z = (x << 7) >> 7;
The integer promotion is performed on x as described above. The value (255) is then shifted left 7 bits as an int or an unsigned int, always producing 0x7f80, and then shifted right by 7 positions, with a final value of 0xff. This behavior is fully defined.
printf("%x\n", z);
Once more, the format string should be printf("%hhx\n", z); and the output would always be ff.
Systems where bytes have more than 8 bits are becoming rare these days, but some embedded processors, such as specialized DSPs, still have them. It would take a perverse system to fail when passed an unsigned char for a %x conversion specifier, but it is cleaner to either use %hhx or, more portably, write printf("%x\n", (unsigned)z);
Shifting by 8 instead of 7 in this example would be even more contrived. It would have undefined behavior on systems with 16-bit int and 8-bit char, because 255 << 8 (0xff00) cannot be represented in a 16-bit int.