Unicode problems in C++ but not C

无人共我 2021-01-11 15:13

I'm trying to write Unicode strings to the screen in C++ on Windows. I changed my console font to Lucida Console and I set the output to CP_UTF8 a

3 answers
  •  野趣味
    野趣味 (OP)
    2021-01-11 16:04

    If your file is encoded as UTF-8, you'll find the string length is 12 bytes, even though the word has only six letters. Run strlen from <string.h> (<cstring> in C++) on it to see what I mean. Setting the output code page will print the bytes exactly as you see them.

    What the compiler sees is equivalent to the following:

    const char text[] = "\xd0\xa0\xd0\xbe\xd1\x81\xd1\x81\xd0\xb8\xd1\x8f";
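
The length of 12 can be checked directly. The sketch below reuses the byte sequence from the answer and compares the byte count reported by strlen with a count of code points. The helpers count_code_points and check_lengths are hypothetical names introduced for illustration, not part of the original answer, and the counter assumes its input is valid UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// The byte sequence from the answer: six Cyrillic letters, two UTF-8 bytes each.
const char text[] = "\xd0\xa0\xd0\xbe\xd1\x81\xd1\x81\xd0\xb8\xd1\x8f";

// Hypothetical helper: counts code points by skipping UTF-8 continuation
// bytes, which always have the bit pattern 10xxxxxx.
// Assumption: the input is valid, NUL-terminated UTF-8.
std::size_t count_code_points(const char* s) {
    std::size_t n = 0;
    for (; *s != '\0'; ++s) {
        if ((static_cast<unsigned char>(*s) & 0xC0) != 0x80) {
            ++n;  // this byte starts a new code point
        }
    }
    return n;
}

// Hypothetical helper: bundles both checks so they can be called from main().
void check_lengths() {
    assert(std::strlen(text) == 12);       // strlen counts bytes
    assert(count_code_points(text) == 6);  // six letters
}
```

Calling check_lengths() from main() should pass silently. strlen knows nothing about encodings, so anything that measures or pads the string by that value is working in bytes, not letters.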
    

    Wrap it in a wide string (wchar_t in particular), and things aren't so nice.

    Why does C++ handle it differently? I haven't the slightest clue, except perhaps that the mechanism underlying the C++ version is somewhat ignorant of encodings (e.g. std::cout happily outputs whatever bytes you give it). Whatever the cause, sticking to C is apparently safest, which I find unexpected considering that Microsoft's own C compiler can't even compile C99 code.

    In any case, I'd advise against outputting to the Windows console if possible, Unicode or not. Files are so much more reliable, not to mention less of a hassle.
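
As a rough illustration of the file route, here is a minimal sketch that writes UTF-8 bytes to a file unchanged and reads them back. The function names and the file name in the usage note are hypothetical, chosen only for this example. Opening the streams in binary mode is an assumption made so that no newline translation touches the bytes.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: writes the bytes to `path` exactly as given.
// Binary mode keeps the runtime from translating any byte on the way out.
bool write_utf8_file(const std::string& path, const std::string& bytes) {
    std::ofstream out(path, std::ios::binary);
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    out.flush();
    return out.good();
}

// Hypothetical helper: reads the whole file back as raw bytes.
std::string read_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Hypothetical helper: asserts that the bytes survive a write/read round trip.
void check_round_trip(const std::string& path, const std::string& bytes) {
    assert(write_utf8_file(path, bytes));
    assert(read_file(path) == bytes);
}
```

For example, check_round_trip("utf8_sketch.txt", "\xd0\xa0\xd0\xbe\xd1\x81\xd1\x81\xd0\xb8\xd1\x8f") writes the 12 bytes from the answer and confirms they come back unchanged. A text editor set to UTF-8 should then show the word correctly, with no console code page involved.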
