Simple Character Interpretation In C

暖寄归人 2021-01-06 12:54

Here is my code

 #include <stdio.h>

 int main(void)
 {
     char ch = 129;
     printf("%d", ch);
     return 0;
 }

I get the output as -127. What does it mean?

9 answers
  • 2021-01-06 13:16

    It means that char is an 8-bit variable that can hold only 2^8 = 256 distinct values. On your system plain char is signed, so those values run from -128 to 127. When you go past 127, the value starts over from -128.

    Think of it like some arcade games where you go from one side of the screen to the other:

    ch = 50;

                                        ----->                        50 is stored
          |___________________________________|___________|           since it fits
        -128                       0         50          127          between -128
                                                                      and 127
    

    ch = 129;

                                                        ---           129 goes over
          -->                                                         127 by 2, so
          |__|____________________________________________|           it 'lands' in
        -128  -127                 0                     127          -127
    

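    BUT!! You shouldn't rely on this: the result of storing an out-of-range value in a signed char is implementation-defined, so the standard does not guarantee the wrap-around!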
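The wrap-around in the pictures above can be computed explicitly, without relying on how a particular compiler converts an out-of-range value. A minimal sketch, assuming 8-bit bytes and a two's complement range; `wrap_to_int8` is a hypothetical helper name, not something from the question:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: map any int onto the 8-bit two's complement range
   -128..127, the way the "arcade screen" picture wraps around. */
int wrap_to_int8(int v)
{
    assert(CHAR_BIT == 8);                        /* assumption: 8-bit bytes */
    unsigned int low = (unsigned int)v & 0xFFu;   /* keep the low 8 bits */
    return low >= 128u ? (int)low - 256 : (int)low;
}
```

Here `wrap_to_int8(129)` evaluates to -127 and `wrap_to_int8(128)` to -128, matching the second picture, while `wrap_to_int8(50)` stays 50.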


    In honor of Luchian Grigore here's the bit representation of what's happening:

    A char is a variable that holds 8 bits, or one byte. So we have 8 0's and 1's struggling to represent whatever value you desire. If the char is signed, the representation also has to say whether the number is positive or negative. You have probably read about one bit representing the sign. That is an abstraction of the true process; in fact, that scheme (sign-magnitude) was only one of the first solutions implemented in electronics. But such a trivial method had a problem: you would have 2 ways of representing 0 (+0 and -0):

    0 0000000     ->    +0        1 0000000     ->    -0                    
    ^                             ^ 
    |_ sign bit 0: positive       |_ sign bit 1: negative
    

    Inconsistencies guaranteed!! So, some very smart folks came up with a system called Ones' Complement which would represent a negative number as the negation (NOT operation) of its positive counterpart:

    01010101      ->    +85
    10101010      ->    -85
    

    This system... had the same problem. 0 could be represented as 00000000 (+0) and 11111111 (-0). Then came some smarter folks who created Two's Complement, which keeps the negation step of the earlier method and then adds 1, thereby removing that pesky -0 and adding a shiny new number to our range: -128! So how does our range look now?

    00000000     +0
    00000001     +1
    00000010     +2
    ...
    01111110     +126
    01111111     +127
    10000000     -128
    10000001     -127
    10000010     -126
    ...
    11111110     -2
    11111111     -1
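The two negation rules described above can be tried on raw bit patterns. This is only a sketch: the function names are made up for illustration, and `uint8_t` is used so that the arithmetic on the bits themselves is well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Ones' complement: flip every bit. */
uint8_t negate_ones_complement(uint8_t bits)
{
    return (uint8_t)~bits;
}

/* Two's complement: flip every bit, then add 1 (modulo 256). */
uint8_t negate_twos_complement(uint8_t bits)
{
    return (uint8_t)(~bits + 1u);
}

/* Spot checks against the bit patterns listed above. */
void check_negation_examples(void)
{
    assert(negate_ones_complement(0x55) == 0xAA); /* 01010101 -> 10101010 */
    assert(negate_ones_complement(0x00) == 0xFF); /* +0 -> the extra "-0" */
    assert(negate_twos_complement(0x00) == 0x00); /* only one zero left   */
    assert(negate_twos_complement(0x7F) == 0x81); /* +127 -> 10000001     */
}
```

The last check shows that 10000001, which is 129 when read as an unsigned byte, is the two's complement pattern for -127.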
    

    So, this should give an idea of what's happening when our little processor tries to add numbers to our variable:

     00110010     50                   01111111     127
    +00000010    + 2                  +00000010    +  2
     --------     --                   --------     ---
     00110100     52                   10000001    -127
          ^                                  ^       ^
          |_ 1 + 1 = 10          129 in bin _|       |_ wait, what?!
    

    Yep, if you review the range table above you can see that up to 127 (01111111) the binary was fine and dandy, nothing weird happening. But once the 8th bit is set at -128 (10000000), the number is no longer interpreted by its binary magnitude but by its Two's Complement representation. This means the binary representation, the bits in your variable, the 1's and 0's, the heart of our beloved char, does hold a 129... it's there, look at it! But the evil processor reads that as a measly -127 because the variable HAD to be signed, undermining all its positive potential for a smelly shift along the real number line in the Euclidean space of dimension one.
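To see that the stored bits and their signed reading are two different things, the same byte can be printed as a bit string and then reinterpreted. A small sketch, assuming 8-bit bytes and a two's complement machine; `byte_to_bits` and `read_as_signed` are hypothetical helper names:

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Write the 8 bits of a byte, most significant first, into out[9]. */
void byte_to_bits(unsigned char byte, char out[9])
{
    assert(CHAR_BIT == 8);                 /* assumption: 8-bit bytes */
    for (int i = 0; i < 8; i++)
        out[i] = (byte & (0x80u >> i)) ? '1' : '0';
    out[8] = '\0';
}

/* Reinterpret the same byte as a signed char by copying its bits. */
int read_as_signed(unsigned char byte)
{
    signed char s;
    memcpy(&s, &byte, 1);
    return s;
}
```

With `unsigned char b = 129;`, `byte_to_bits` produces "10000001" and `read_as_signed(b)` gives -127 on a two's complement machine, while `b` itself still reads as 129.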

  • 2021-01-06 13:18

    On your system, a char holding 129 has the same bits as the 8-bit signed integer -127. An unsigned 8-bit integer ranges from 0 to 255, and a signed one from -128 to 127.

    Related (C++):

    You may also be interested in reading the nice top answer to What is an unsigned char?

    As @jmquigley points out, the C standard does not guarantee this result, so you should not rely on it. See also: Allowing signed integer overflows in C/C++

  • 2021-01-06 13:19

    On your system, char is 8 bits and signed, so it can only hold values from -128 to 127. When you try to assign 129 to it, you get the result you see because the sign bit ends up set. Another way to think of it is that the number "wraps" around.

  • 2021-01-06 13:23

    It means you ran into implementation-defined behavior.

    The C standard does not fix the outcome; your compiler does.

    char ch=129; converts 129 to char, and 129 is not a representable value for a char on your specific setup.
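Whether a value is representable can be checked before the assignment, using the limits from `<limits.h>`. A sketch only; `fits_in_char` and `to_char_checked` are hypothetical helper names:

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if v can be stored in a plain char without changing value. */
int fits_in_char(int v)
{
    return v >= CHAR_MIN && v <= CHAR_MAX;
}

/* Convert v to char; the assert aborts if v is not representable. */
char to_char_checked(int v)
{
    assert(fits_in_char(v));
    return (char)v;
}
```

Where plain char is signed and 8 bits wide, `fits_in_char(129)` is 0; where plain char is unsigned, it is 1.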

  • 2021-01-06 13:26

    Whether a plain char is signed or unsigned is implementation-defined. This is a quite stupid, obscure rule in the C language: int, long, etc. are guaranteed to be signed, but char could be signed or unsigned; it is up to the compiler implementation.

    On your particular compiler, char is apparently signed. This means, assuming that your system uses two's complement, that it can hold values of -128 to 127.
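Which choice a given compiler made can be read off `<limits.h>`. A minimal sketch; the function names are hypothetical:

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if plain char is signed on this implementation, else 0. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* The limits of plain char match one of the two explicit char types. */
void check_char_limits(void)
{
    if (plain_char_is_signed()) {
        assert(CHAR_MIN == SCHAR_MIN && CHAR_MAX == SCHAR_MAX);
    } else {
        assert(CHAR_MIN == 0 && CHAR_MAX == UCHAR_MAX);
    }
}
```

With GCC or Clang the default can be overridden using the `-fsigned-char` and `-funsigned-char` options, which is one more reason not to depend on it.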

    You attempt to store the value 129 in such a variable. The value does not fit, so it has to be converted, and the result of an out-of-range conversion to a signed type is implementation-defined (C99 6.3.1.3): the standard lets the implementation pick the resulting value or raise a signal. In practice, most (all?) compilers implement this as "wrap around", as described in other answers.

    To sum it up, your code relies on two different behaviors that aren't well defined by the standard. Understanding how the result of such unpredictable code ends up in a certain way has limited value. The important thing here is to recognize that the code is obscure, and learn how to write it in a way that isn't obscure.

    The code could for example be rewritten as:

    unsigned char ch = 129;

    Or even better:

    #include <stdint.h>
    ...
    uint8_t ch = 129;
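With an unsigned type the stored value is fully defined by the standard: in-range values are kept and out-of-range values are reduced modulo 256. A small sketch, assuming the platform provides `uint8_t`; `store_in_uint8` is a hypothetical helper:

```c
#include <assert.h>
#include <stdint.h>

/* Store v in a uint8_t; out-of-range values are reduced modulo 256. */
unsigned int store_in_uint8(unsigned int v)
{
    uint8_t ch = (uint8_t)v;
    return ch;
}

/* Spot checks for an in-range and an out-of-range value. */
void check_uint8_examples(void)
{
    assert(store_in_uint8(129) == 129);   /* fits, value preserved */
    assert(store_in_uint8(300) == 44);    /* 300 - 256 */
}
```

Printing such a variable with `printf("%u", (unsigned)ch);` then shows 129 rather than -127.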
    

    As a rule of thumb, make sure to follow these rules in MISRA-C:2004:

    6.1 The plain char type shall be used only for the storage and use of character values.

    6.2 signed and unsigned char type shall be used only for the storage and use of numeric values.

  • 2021-01-06 13:27

    Your char is most likely an 8-bit signed integer that is stored using Two's complement. Such a variable can only represent numbers between -128 and 127. If you do "127+1" it wraps around to -128. So 129 is equivalent to -127.
