C++20 added char8_t
and std::u8string
for UTF-8. However, there is no UTF-8 version of std::cout
and OS APIs mostly expect char.
At present, std::c8rtomb
and std::mbrtoc8
are the only interfaces provided by the standard that enable conversion between the execution encoding and UTF-8. The interfaces are awkward. They were designed to match pre-existing interfaces like std::c16rtomb
and std::mbrtoc16.
The wording added to the C++ standard for these new interfaces intentionally matches the wording in the C standard for the pre-existing related functions (hopefully these new functions will eventually be added to C; I still need to pursue that). The intent in matching the C standard wording, as confusing as it is, is to ensure that anyone familiar with the C wording recognizes that the char8_t
interfaces work the same way.
cppreference.com has some examples for the UTF-16 versions of these functions that should be useful for understanding the char8_t
variants.
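To give an idea of the calling pattern, here is a minimal sketch in the spirit of those UTF-16 examples. It uses std::mbrtoc16 because it is widely implemented. The helper name to_utf16 is hypothetical, and the error handling (throwing std::runtime_error) is an assumption made for illustration. Since the char8_t interfaces work the same way, a std::mbrtoc8 loop should have the same shape, with char8_t in place of char16_t and possibly several consecutive (size_t)-3 results for a single character.

```cpp
#include <cassert>
#include <cstddef>
#include <cuchar>
#include <stdexcept>
#include <string>

// Hypothetical helper: convert a string in the execution (locale) encoding
// to UTF-16 with std::mbrtoc16. Conversion stops at the first null character.
inline std::u16string to_utf16(const std::string& in) {
    std::u16string out;
    std::mbstate_t state{};                // initial conversion state
    const char* p = in.c_str();
    const char* end = p + in.size() + 1;   // include the terminating '\0'
    for (;;) {
        assert(p < end);
        char16_t c16;
        std::size_t rc =
            std::mbrtoc16(&c16, p, static_cast<std::size_t>(end - p), &state);
        if (rc == 0) break;                // converted the null character: done
        if (rc == static_cast<std::size_t>(-1))
            throw std::runtime_error("invalid multibyte sequence");
        if (rc == static_cast<std::size_t>(-2))
            throw std::runtime_error("incomplete multibyte sequence");
        out.push_back(c16);
        // (size_t)-3 means a pending code unit (the second half of a
        // surrogate pair) was stored and no input was consumed.
        if (rc != static_cast<std::size_t>(-3)) p += rc;
    }
    return out;
}
```

The conversion depends on the current C locale, so a program would typically call std::setlocale(LC_ALL, "") before using it. In the default "C" locale, only ASCII input can be relied on to convert.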