How to initialize a char array using hex numbers?

后悔当初 2020-12-17 14:27

I use UTF-8 and have to store a constant in a char array:

const char s[] = {0xE2, 0x82, 0xAC, 0}; // the euro sign

However, it gives me an error.

3 Answers
  • 2020-12-17 15:06

    The short answer to your question is that you are overflowing a char. On your implementation, plain char is signed, with a range of [-128, 127], and 0xE2 = 226 > 127 does not fit. What you need is unsigned char, which has a range of [0, 255].

    const unsigned char s[] = {0xE2, 0x82, 0xAC, 0}; /* note the []: s must be an array */
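
    A minimal complete sketch of the fix (printf's %s expects a pointer to char, hence the cast when printing; this assumes a UTF-8 terminal):

      #include <stdio.h>

      int main(void) {
          /* 0xE2, 0x82 and 0xAC all fit in unsigned char's [0, 255] range */
          const unsigned char s[] = {0xE2, 0x82, 0xAC, 0};
          printf("%s\n", (const char *)s); /* prints the euro sign */
          return 0;
      }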
    
  • 2020-12-17 15:21

    char may be signed or unsigned (the default is implementation-specific). You probably want

      const unsigned char s[] = {0xE2,0x82,0xAC, 0}; 
    

    or

      const char s[] = "\xe2\x82\xac";
    

    (a string literal is an array of char unless you give it some prefix)
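
    For instance, a u8 prefix (C11 / C++11) makes the compiler emit the UTF-8 bytes itself; a sketch (note that since C++20 a u8 literal has type const char8_t[] rather than const char[], so this line would no longer compile there):

      const char s[] = u8"\u20AC"; /* same four bytes: 0xE2 0x82 0xAC 0x00 */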

    See the -funsigned-char (and -fsigned-char) options of GCC.

    On some implementations plain char is unsigned, so CHAR_MAX is 255 and CHAR_MIN is 0. On others char is signed, so CHAR_MIN is -128 and CHAR_MAX is 127 (for example, char is unsigned on 32-bit Linux/PowerPC but signed on 32-bit Linux/x86). AFAIK nothing in the standard prohibits, e.g., 19-bit signed chars.
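
    A quick sketch to check what your implementation does:

      #include <limits.h>
      #include <stdio.h>

      int main(void) {
          /* CHAR_MIN is 0 where plain char is unsigned, negative where it is signed */
          printf("char is %s, CHAR_MIN=%d, CHAR_MAX=%d\n",
                 CHAR_MIN == 0 ? "unsigned" : "signed", CHAR_MIN, CHAR_MAX);
          return 0;
      }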

  • 2020-12-17 15:24

    While it may well be tedious to put lots of casts in your code, it actually smells extremely good to me to use the strongest typing possible.

    As noted above, when you specify type "char" you are inviting the compiler to choose whatever the compiler writer preferred (signed or unsigned). I'm no expert on UTF-8, but there is no reason to make your code non-portable if you don't need to.

    As far as your constants go, I've used compilers that treat constants written that way as signed ints, as well as compilers that consider the context and interpret them accordingly. Note that converting between signed and unsigned can overflow either way: for the same number of bits, a negative value overflows an unsigned type (obviously), and an unsigned value with the top bit set overflows a signed type, because the top bit means negative.
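
    A small illustration of both directions (the signed result is implementation-defined before C23, but wraps as shown on two's-complement machines):

      unsigned char u = (unsigned char)-1;  /* negative overflows unsigned: 255 */
      signed char   c = (signed char)0x80;  /* top bit set overflows signed: -128 */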

    In this case, your compiler is taking your constants as unsigned 8-bit values (or larger), which means they don't fit in a signed 8-bit char. And we are all grateful that the compiler complains (at least I am).

    My perspective is that there is nothing at all bad about casting to show exactly what you intend to happen. And if a compiler lets you assign between signed and unsigned, it should require a cast whether the operands are variables or constants, e.g.

    const int8_t a = (int8_t) 0xFF; // will be -1

    although in my example it would be better to just assign -1. When you find yourself adding extra casts, either they make sense, or you should write your constants so that they make sense for the type you are assigning to.
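
    Applying that principle to the original question, a sketch: the casts keep the array as plain char (which string functions expect) while stating that the narrowing is intentional. The resulting values are implementation-defined where char is signed, but come out as expected on two's-complement machines:

      #include <stdio.h>

      int main(void) {
          /* explicit casts document the intended narrowing of each constant */
          const char s[] = {(char)0xE2, (char)0x82, (char)0xAC, 0};
          printf("%s\n", s); /* the euro sign on a UTF-8 terminal */
          return 0;
      }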
