Handle UTF-8 string

慢半拍i 2021-01-13 06:30

As far as I know, Linux uses UTF-8 encoding. This means I can use std::string for handling strings, right? The encoding will just be UTF-8.

Now on UTF-8 we know s

5 answers
  •  一向 (OP)
    2021-01-13 06:46

    You can store UTF-8 in a std::string, but you cannot really handle it with one. std::string, despite its name, is only a container for bytes. It is not a type for text storage (beyond the fact that a byte buffer can obviously store any object, including text). It doesn’t even store characters (char is a byte, not a character).
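
As a rough illustration of the bytes-versus-characters point, here is a minimal sketch that counts code points in a UTF-8 byte string. The function name `count_code_points` is made up for this example, and the code assumes the input is already valid UTF-8 (it does no validation).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts Unicode code points in a UTF-8 byte string.
// Assumption: the input is valid UTF-8; malformed input is not detected.
std::size_t count_code_points(const std::string& utf8) {
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        // Continuation bytes look like 10xxxxxx. Every other byte
        // starts a new code point, so only those are counted.
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For the string "héllo" encoded as UTF-8, `size()` reports 6 because `é` takes two bytes, while this helper reports 5. Even that number is only a code point count: a user-perceived character can span several code points, which is one reason a real Unicode library is still needed.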

    You need to venture outside the standard library if you want to actually handle (rather than just store) Unicode characters. Traditionally, this is done by libraries such as ICU.
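
To show what "handling" involves even for a trivial operation, here is a hedged sketch of truncating a UTF-8 string without cutting a code point in half. `truncate_utf8` is a hypothetical name, not part of any library, and the code again assumes valid UTF-8 input. It works only at the code point level; it knows nothing about grapheme clusters, normalization, or collation, which is the kind of work a library such as ICU exists for.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the longest prefix of `utf8` that is at most
// `max_bytes` long and does not end in the middle of a code point.
// Assumption: the input is valid UTF-8.
std::string truncate_utf8(const std::string& utf8, std::size_t max_bytes) {
    if (utf8.size() <= max_bytes) {
        return utf8;
    }
    std::size_t end = max_bytes;
    // If the first byte that would be cut off is a continuation byte
    // (10xxxxxx), the cut falls inside a code point, so move it back.
    while (end > 0 &&
           (static_cast<unsigned char>(utf8[end]) & 0xC0) == 0x80) {
        --end;
    }
    return utf8.substr(0, end);
}
```

A plain `substr(0, 2)` on the UTF-8 bytes of "héllo" would keep `h` plus the first half of `é`, leaving an invalid sequence; the sketch above backs up and keeps only `h`.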

    However, while ICU is a mature library, its C++ interface sucks. Ogonek takes a more modern approach. It’s not as well established and is still a work in progress, but it provides a much nicer interface.
