How to best deal with Windows' 16-bit wchar_t ugliness?

暗喜 2020-12-20 01:37

I'm writing a wrapper layer to be used with mingw which provides the application with a virtual UTF-8 environment. Functions which deal with filenames are wrappers which co

2 answers
  • 2020-12-20 01:54

    I'd do something like #4, but don't generate any output until you're sure the input is valid.

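    • mbrtowc should decode the entire character before producing anything. If the character is outside the BMP, output the high surrogate and store the low surrogate in the mbstate_t, so that the next call returns it.
    • wcrtomb should store a high surrogate in the mbstate_t and output nothing yet; once the matching low surrogate arrives, output all 4 UTF-8 bytes if the pair forms a valid character.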
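
    The two bullets above can be sketched roughly as follows. This is only an illustration under assumptions: `u8_state` is a hypothetical stand-in for `mbstate_t` (which is opaque and cannot be filled in portably), `sketch_mbrtowc` and `sketch_wcrtomb` are made-up names, `uint16_t` stands in for the 16-bit `wchar_t`, and the decoder expects the whole character to be present in one call rather than buffering partial input. The `(size_t)-3` return for the stored low surrogate is borrowed from C11's `mbrtoc16`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for mbstate_t: holds one pending surrogate. */
typedef struct { uint16_t pending; } u8_state;

/* Decode one UTF-8 character from s (n bytes available) into UTF-16.
 * Returns bytes consumed, (size_t)-3 when delivering a stored low
 * surrogate, (size_t)-2 if the input is incomplete, (size_t)-1 if invalid.
 * Nothing is written to *out until the input has been fully validated. */
size_t sketch_mbrtowc(uint16_t *out, const char *s, size_t n, u8_state *st)
{
    const unsigned char *p = (const unsigned char *)s;
    uint32_t cp;
    size_t len, i;

    assert(out && st);
    if (st->pending) {                 /* second half of a surrogate pair */
        *out = st->pending;
        st->pending = 0;
        return (size_t)-3;
    }
    if (n == 0) return (size_t)-2;

    if (p[0] < 0x80)                       { cp = p[0];        len = 1; }
    else if (p[0] >= 0xC2 && p[0] <= 0xDF) { cp = p[0] & 0x1F; len = 2; }
    else if ((p[0] & 0xF0) == 0xE0)        { cp = p[0] & 0x0F; len = 3; }
    else if (p[0] >= 0xF0 && p[0] <= 0xF4) { cp = p[0] & 0x07; len = 4; }
    else return (size_t)-1;

    for (i = 1; i < len; i++) {
        if (i >= n) return (size_t)-2;
        if ((p[i] & 0xC0) != 0x80) return (size_t)-1;
        cp = (cp << 6) | (p[i] & 0x3F);
    }
    /* Reject overlong forms, encoded surrogates, and values past U+10FFFF. */
    if ((len == 3 && cp < 0x800) || (len == 4 && cp < 0x10000) ||
        (cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
        return (size_t)-1;

    if (cp >= 0x10000) {               /* outside the BMP: split into a pair */
        cp -= 0x10000;
        *out = (uint16_t)(0xD800 | (cp >> 10));
        st->pending = (uint16_t)(0xDC00 | (cp & 0x3FF));
    } else {
        *out = (uint16_t)cp;
    }
    return len;
}

/* Encode one UTF-16 unit as UTF-8. Returns bytes written to out (0 when a
 * high surrogate was only stored), or (size_t)-1 on an unpaired surrogate. */
size_t sketch_wcrtomb(char *out, uint16_t wc, u8_state *st)
{
    uint32_t cp;

    assert(out && st);
    if (st->pending) {                 /* a high surrogate is waiting */
        uint32_t hi = st->pending;
        st->pending = 0;
        if (wc < 0xDC00 || wc > 0xDFFF) return (size_t)-1;
        cp = 0x10000 + (((hi - 0xD800) << 10) | ((uint32_t)wc - 0xDC00));
    } else if (wc >= 0xD800 && wc <= 0xDBFF) {
        st->pending = wc;              /* store it, output nothing yet */
        return 0;
    } else if (wc >= 0xDC00 && wc <= 0xDFFF) {
        return (size_t)-1;             /* low surrogate with no high one */
    } else {
        cp = wc;
    }

    if (cp < 0x80) { out[0] = (char)cp; return 1; }
    if (cp < 0x800) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (char)(0xE0 | (cp >> 12));
        out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (char)(0xF0 | (cp >> 18));
    out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (char)(0x80 | (cp & 0x3F));
    return 4;
}
```

    Each direction needs its own zero-initialized state object. A real wrapper would also have to decide what to do when a caller abandons a conversion while a surrogate is still pending.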
  • 2020-12-20 02:04

    If you are on Windows, you can convert between UTF-16 and UTF-8 a whole string at a time using MultiByteToWideChar and WideCharToMultiByte.

    While GCC defaults to a 32-bit wchar_t on most targets, there are compiler switches that change that (such as -fshort-wchar), and more generally the C and C++ standards don't specify the size of wchar_t; in fact, wchar_t can be the same size as char.
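
    Since the size is not fixed, wrapper code can check it rather than assume it. A minimal sketch, where `wchar_needs_surrogates` is a hypothetical helper name and the only library facility used is the standard `WCHAR_MAX` macro:

```c
#include <assert.h>
#include <stdint.h>
#include <wchar.h>

/* Returns 1 when wchar_t cannot hold every Unicode code point, so that
 * characters above U+FFFF have to be represented as surrogate pairs,
 * and 0 when a single wchar_t is wide enough for any code point. */
int wchar_needs_surrogates(void)
{
    return WCHAR_MAX < 0x10FFFF;
}
```

    Because `WCHAR_MAX` is a preprocessor constant, the same comparison can also be written in an `#if` to select the surrogate-handling code path at compile time.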

    If you want to avoid using Windows APIs (in your Windows wrapper code!?) then use mbstowcs to convert an entire string at a time.
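
    A rough sketch of the whole-string approach, with `convert_whole_string` as a hypothetical helper name. It assumes the C runtime accepts a null destination in `mbstowcs` as a length query (a POSIX behavior that common runtimes also provide, but not something ISO C guarantees), and that the program has already selected a suitable locale with `setlocale`. Whether a UTF-8 locale is available at all depends on the C runtime in use.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Convert a whole multibyte string, in the current locale's encoding,
 * to a newly allocated wide string. The caller frees the result.
 * Returns NULL on an invalid sequence or allocation failure. */
wchar_t *convert_whole_string(const char *s)
{
    size_t n;
    wchar_t *w;

    assert(s != NULL);
    /* Length query: number of wide characters, excluding the terminator. */
    n = mbstowcs(NULL, s, 0);
    if (n == (size_t)-1) return NULL;

    w = malloc((n + 1) * sizeof *w);
    if (w == NULL) return NULL;

    /* Second pass performs the conversion and writes the terminator. */
    if (mbstowcs(w, s, n + 1) == (size_t)-1) {
        free(w);
        return NULL;
    }
    return w;
}
```

    Converting the full string in one call sidesteps the per-character state problem, at the cost of an allocation and two passes over the input.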
