Convert between std::u8string and std::string

别跟我提以往 2021-02-12 10:32

C++20 added char8_t and std::u8string for UTF-8. However, there is no UTF-8 version of std::cout and OS APIs mostly expect char

1 answer
  •  醉话见心
    2021-02-12 10:33

    UTF-8 "support" in C++20 seems to be a bad joke.

    The only UTF functionality in the STL is support for strings and string_views (std::u8string, std::u8string_view, std::u16string, ...). That is all. There is no STL support for UTF encodings in regular expressions, formatting, file I/O and so on.

    In C++17 you can at least treat any UTF-8 data as 'char' data easily, which makes it possible to use std::regex, std::fstream, std::cout, etc. without loss of performance.

    In C++20 this changes. You can no longer write, for example, std::string text = u8"..."; and it is impossible to write something like

    std::u8fstream file; std::u8string line; ... file << line;
    

    since there is no std::u8fstream.

    Even the new C++20 std::format does not support UTF at all, because all necessary overloads are simply missing. You cannot write

    std::u8string text = std::format(u8"...{}...", 42);
    

    To make matters worse, there is no simple cast or conversion between std::string and std::u8string (or even between const char* and const char8_t*). So if you want to format UTF-8 data with std::format, or read and write it through std::cin, std::cout, std::fstream and so on, you have to copy every string internally. That is an unnecessary performance killer.

    Finally, what use will UTF have without input, output, and formatting?
