Truncating an int to char - is it defined?

说谎 2020-12-19 11:04
unsigned char a, b;
b = something();
a = ~b;

A static analyzer complained of truncation in the last line, presumably because b is prom

5 answers
  • 2020-12-19 11:33

    The truncation happens as described in 6.3.1.3/2 of the C99 Standard:

    ... if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.


    Example for CHAR_BIT == 8, sizeof (unsigned char) == 1, sizeof (int) == 4

    So 0x55 is promoted to int, giving 0x00000055, then complemented to 0xFFFFFFAA. As a two's complement int that bit pattern is the value -86, and

              -86    /* 0xFFFFFFAA */
            + 256    /* UCHAR_MAX + 1 */
            -----
              170    /* 0x000000AA */

    A single addition already brings the value into range, giving plain 0xAA, as you'd expect.
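The promote, complement, convert-back sequence can be checked directly. This is only a minimal sketch, assuming a two's complement int that is wider than unsigned char; the helper names are made up for illustration.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: the int value that ~b has after integer promotion. */
int promoted_complement(unsigned char b)
{
    return ~b;                 /* b is promoted to int before ~ is applied */
}

/* Hypothetical helper: what `a = ~b;` stores in a. */
unsigned char stored_complement(unsigned char b)
{
    /* Conversion to unsigned char reduces the int modulo UCHAR_MAX + 1
       (C99 6.3.1.3/2); the cast only makes that conversion explicit. */
    return (unsigned char)~b;
}

/* Sanity checks for the 0x55 example. */
void check_complement(void)
{
    assert(promoted_complement(0x55) == -0x56);           /* assumes two's complement */
    assert(stored_complement(0x55) == UCHAR_MAX - 0x55);  /* 0xAA when CHAR_BIT == 8 */
}
```

The cast in `stored_complement` does not change the result; it only states the narrowing conversion explicitly, which is usually enough to quiet a truncation warning.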

  • 2020-12-19 11:34

    Let's take the case of a Win32 machine.
    An int is 4 bytes there, and converting it to an unsigned char gives exactly the result you would get by dropping the three most significant bytes.

    Since you start from an unsigned char and store back into an unsigned char, it doesn't matter what type the value is promoted to in between.
    ~b adds three zero bytes on the left, flips them to ones along with the rest, and the assignment then removes them again. None of this affects the low-order byte.

    The same concept applies to other architectures (be it a 16-bit or a 64-bit machine).

    Endianness does not matter here: the conversion is defined on values, not on the byte order in memory.
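To see that narrowing keeps only the low-order part of the value, compare the conversion with an explicit mask. This is a sketch with a hypothetical helper name, not code from the question.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: convert an unsigned int to unsigned char. */
unsigned char narrow(unsigned int x)
{
    return (unsigned char)x;   /* value is reduced modulo UCHAR_MAX + 1 */
}

void check_narrow(void)
{
    /* Same result as masking off everything above the low-order byte. */
    assert(narrow(0xFFFFFFAAu) == (0xFFFFFFAAu & UCHAR_MAX));
    assert(narrow(0x12345678u) == (0x12345678u & UCHAR_MAX));
}
```

Both checks work on values only, so they hold whatever byte order the machine uses in memory.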

  • 2020-12-19 11:53

    It will behave as you want it to. It is safe to cast the value.

  • 2020-12-19 11:56

    The C standard specifies this for unsigned types:

    A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.

    In this case, if your unsigned char is 8 bits, it means that the result will be reduced modulo 256, which means that if b was 0x55, a will indeed end up as 0xAA.

    But note that if unsigned char is wider than 8 bits (which is perfectly legal), you will get a different result. To ensure that you will portably get 0xAA as the result, you can use:

    a = ~b & 0xff;
    

    (The bitwise AND should be optimised out on platforms where unsigned char is 8 bits.)

    Note also that if the destination is a signed type and the value does not fit in it, the result is implementation-defined.
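The masked form can be wrapped in a small function for testing. A minimal sketch, assuming a two's complement int; `complement_low8` is a made-up name.

```c
#include <assert.h>

/* Hypothetical helper: complement b, keeping only the low 8 bits. */
unsigned char complement_low8(unsigned char b)
{
    /* The mask is applied to the promoted int, so the result stays in
       0..0xFF no matter how wide unsigned char is. */
    return (unsigned char)(~b & 0xff);
}

void check_mask(void)
{
    assert(complement_low8(0x55) == 0xAA);
    assert(complement_low8(0x00) == 0xFF);
    assert(complement_low8(0xFF) == 0x00);
}
```

Because the masked value always fits in an unsigned char, the final conversion can no longer change it.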

  • 2020-12-19 11:57

    This particular code example is safe. But there are reasons to warn against lax use of the ~ operator.

    The reason behind this is that ~ on small integer variables is a potential bug in more complex expressions, because of the implicit integer promotions in C. Imagine if you had an expression like

    a = ~b >> 4;

    Because ~b is evaluated as an int, and that int is negative here, the shift will not bring in zeroes from the left as might have been expected.

    If your static analyzer is set to include MISRA-C, you will for example get this warning for each ~ operator, because MISRA requires the result of any operation on small integer types to be explicitly cast to the expected type, unsigned char in this case.
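The shift pitfall and the cast that avoids it can be put side by side. This is only a sketch, assuming CHAR_BIT == 8 and an int wider than unsigned char; both function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: shifts the promoted int, so the bits that arrive
   from the left come from the int's upper bits, not zeroes. */
unsigned char shift_unsafe(unsigned char b)
{
    /* Right-shifting a negative int is implementation-defined. */
    return (unsigned char)(~b >> 4);
}

/* Hypothetical helper: narrow back to unsigned char before shifting. */
unsigned char shift_safe(unsigned char b)
{
    return (unsigned char)((unsigned char)~b >> 4);
}

void check_shift(void)
{
    assert(shift_safe(0x55) == 0x0A);  /* 0xAA >> 4, assuming CHAR_BIT == 8 */
}
```

With b equal to 0x55, `shift_unsafe` typically yields 0xFA on common compilers, whereas `shift_safe` yields the 0x0A most readers expect.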
