Endianness — why do chars put in an Int16 print backwards?

Posted by 心不动则不痛 on 2019-12-31 03:55:09

Question


The following C code, compiled and run in Xcode:

UInt16 chars = 'ab';
printf("\nchars: %2.2s", (char*)&chars);

prints 'ba', rather than 'ab'.

Why?


Answer 1:


That particular implementation seems to store the integer value of the multi-character constant in little-endian format. In the constant 'ab' the character 'b' is the least significant byte (the little end) and the character 'a' is the most significant byte, and a little-endian machine puts the least significant byte at the lowest address. If you viewed chars as an array of bytes, it'd be chars[0] = 'b' and chars[1] = 'a', and thus would be treated by printf as "ba".

Also, I'm not sure how accurate you consider Wikipedia, but regarding C syntax it has this section:

Multi-character constants (e.g. 'xy') are valid, although rarely useful — they let one store several characters in an integer (e.g. 4 ASCII characters can fit in a 32-bit integer, 8 in a 64-bit one). Since the order in which the characters are packed into one int is not specified, portable use of multi-character constants is difficult.

So it appears the 'ab' multi-character constant format should be avoided in general.




Answer 2:


It depends on the system you're compiling/running your program on.

Evidently, on your system the 16-bit value 0x6162 ('a' in the high byte, 'b' in the low byte) is stored in memory as the bytes 0x62 0x61 ('b', then 'a'): the little-endian way.

When you ask printf to treat that memory as a string, it reads the stored bytes one at a time in memory order, which is 'b', then 'a'. Hence your result.




Answer 3:


Multicharacter character literals are implementation-defined:

C99 6.4.4.4p10: "The value of an integer character constant containing more than one character (e.g., 'ab'), or containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined."

gcc and icl print ba on Windows 7. tcc prints a and drops the second letter altogether...




Answer 4:


The answer to your question can be found in your tags: endianness. On a little-endian machine the least significant byte is stored first. This is only a convention and does not affect efficiency at all.

Of course, this means that you cannot simply cast the integer to a character string and expect the characters in the order you wrote them: a string has no notion of byte significance, it is just a sequence of bytes in memory order.

If you want to view the bytes within your variable, I suggest using a debugger that can read the actual bytes.



Source: https://stackoverflow.com/questions/7849574/endianness-why-do-chars-put-in-an-int16-print-backwards
