How to read/store unicode with STL strings and streams

日久生厌 2020-12-31 16:53

I need to modify my program to accept Unicode, which may arrive as UTF-8 or as any of the various UTF-16 and UTF-32 encodings. I don't really know much about Unicode (though

2 Answers
  • 2020-12-31 17:26

    UTF-8 conserves space as long as the text consists mostly of standard ASCII characters, each of which takes a single byte.

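    std::string has no problem with UTF-8, because a UTF-8 byte sequence contains no zero bytes unless the text itself includes the NUL character (U+0000). For data that does contain zero bytes, such as UTF-32 viewed as raw bytes, you can tell std::string how long the input is instead of relying on a terminator. Note that size() and length() report the number of bytes, not the number of characters, so std::string cannot tell you how many characters your UTF-8 string holds; you have to use an external function for that.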
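The "external function" for counting characters can be quite small. The following is a minimal sketch, not a library routine: the name `utf8_length` is hypothetical, it counts code points (not user-perceived characters), and it assumes the input is already valid UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts Unicode code points in a UTF-8 encoded string.
// Assumption: the input is valid UTF-8; no validation is performed.
std::size_t utf8_length(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char byte : s) {
        // Continuation bytes look like 10xxxxxx; any other byte starts a new code point.
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For the string "h\xC3\xA9llo" ("héllo" in UTF-8), size() reports 6 bytes while utf8_length reports 5 code points. A string built with an explicit length, such as std::string("a\0b", 3), keeps its embedded zero byte and has size() 3.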

    There is also a wide version of std::string that uses wchar_t instead of char: std::wstring.

    Boost also provides facets for transforming between encodings.

    You can use the standard library together with Boost, or you can use the string-handling functions from the C library. Programming frameworks such as Qt and Tcl provide their own functions as well.

    See for example:

    utf8 codecvt facet
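To make concrete what such an encoding conversion does, here is a hand-written sketch that decodes UTF-8 into UTF-32 code points. It is not the Boost facet and not any library's API: the function name `utf8_to_utf32` is hypothetical, and the decoder is deliberately simplified (it rejects malformed or truncated sequences but does not reject overlong encodings or surrogate code points). For production use, a tested library is the safer choice.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: decodes a UTF-8 byte string into UTF-32 code points.
// Simplification: overlong encodings and surrogates are not rejected.
std::u32string utf8_to_utf32(const std::string& in) {
    std::u32string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes that follow

        if (lead < 0x80) {
            cp = lead;                       // 0xxxxxxx: plain ASCII
        } else if ((lead & 0xE0) == 0xC0) {
            cp = lead & 0x1F; extra = 1;     // 110xxxxx
        } else if ((lead & 0xF0) == 0xE0) {
            cp = lead & 0x0F; extra = 2;     // 1110xxxx
        } else if ((lead & 0xF8) == 0xF0) {
            cp = lead & 0x07; extra = 3;     // 11110xxx
        } else {
            throw std::invalid_argument("invalid UTF-8 lead byte");
        }

        if (in.size() - i <= extra) {
            throw std::invalid_argument("truncated UTF-8 sequence");
        }

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) {
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            }
            cp = (cp << 6) | (c & 0x3F);     // append the low 6 payload bits
        }

        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}
```

Once the text is in a std::u32string, size() gives the number of code points directly, at the cost of four bytes per code point.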

  • 2020-12-31 17:29

    Have a look at the question "Switching from std::string to std::wstring for embedded applications?"

    As Pukku said there, you might get some headaches because the C++ standard requires wide streams to convert wide characters to a byte-oriented encoding when writing to a file, and how this conversion is done is implementation-dependent.
