Question
I'm building an API that allows me to fetch strings in various encodings, including utf8, utf16, utf32 and wchar_t (which may be utf32 or utf16 depending on the OS).
The new C++ standard introduces the new types char16_t and char32_t, which do not have this sizeof ambiguity and should be used in the future, so I would like to support them as well. The question is: would they interfere with the normal uint16_t, uint32_t and wchar_t types, preventing overloading because they may refer to the same type?

    class some_class {
    public:
        void set(std::string);                  // utf8 string
        void set(std::wstring);                 // wchar string, utf16 or utf32 according to sizeof(wchar_t)
        void set(std::basic_string<uint16_t>);  // wchar independent utf16 string
        void set(std::basic_string<uint32_t>);  // wchar independent utf32 string
    #ifdef HAVE_NEW_UNICODE_CHARRECTERS
        void set(std::basic_string<char16_t>);  // new standard utf16 string
        void set(std::basic_string<char32_t>);  // new standard utf32 string
    #endif
    };
So I can just write:
    foo.set(U"Some utf32 String");
    foo.set(u"Some utf16 string");
What are the typedefs for std::basic_string<char16_t> and std::basic_string<char32_t>, analogous to today's typedef basic_string<wchar_t> wstring? I can't find any reference.
Edit: according to headers of gcc-4.4, that introduced these new types:
    typedef basic_string<char16_t> u16string;
    typedef basic_string<char32_t> u32string;
I just want to make sure that this is an actual requirement of the standard and not a gcc-ism.
Answer 1:
1) char16_t and char32_t will be distinct new types, so overloading on them will be possible.
Quote from ISO/IEC JTC1 SC22 WG21 N2018:
Define char16_t to be a typedef to a distinct new type, with the name _Char16_t that has the same size and representation as uint_least16_t. Likewise, define char32_t to be a typedef to a distinct new type, with the name _Char32_t that has the same size and representation as uint_least32_t.
Further explanation (from a devx.com article "Prepare Yourself for the Unicode Revolution"):
You're probably wondering why the _Char16_t and _Char32_t types and keywords are needed in the first place when the typedefs uint_least16_t and uint_least32_t are already available. The main problem that the new types solve is overloading. It's now possible to overload functions that take _Char16_t and _Char32_t arguments, and create specializations such as std::basic_string<_Char16_t> that are distinct from std::basic_string<wchar_t>.
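The overloading point can be checked with a short sketch. It assumes a C++11 (or later) compiler, where char16_t and char32_t are built-in types; the `kind` overload set and `demo` function are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>

// The new character types are distinct from the integer typedefs and from
// wchar_t, even where they share size and representation.
static_assert(!std::is_same<char16_t, std::uint_least16_t>::value,
              "char16_t is not an alias of uint_least16_t");
static_assert(!std::is_same<char32_t, std::uint_least32_t>::value,
              "char32_t is not an alias of uint_least32_t");
static_assert(!std::is_same<char16_t, wchar_t>::value,
              "char16_t is not an alias of wchar_t");
static_assert(!std::is_same<char32_t, wchar_t>::value,
              "char32_t is not an alias of wchar_t");

// Hypothetical overload set: because the types are distinct, these
// declarations do not collide with each other.
inline std::string kind(char16_t) { return "char16_t"; }
inline std::string kind(char32_t) { return "char32_t"; }
inline std::string kind(wchar_t) { return "wchar_t"; }
inline std::string kind(std::uint_least16_t) { return "uint_least16_t"; }

// Exercises each overload with an argument of exactly that type.
inline void demo() {
    assert(kind(u'a') == "char16_t");   // u'…' literal has type char16_t
    assert(kind(U'a') == "char32_t");   // U'…' literal has type char32_t
    assert(kind(L'a') == "wchar_t");    // L'…' literal has type wchar_t
    assert(kind(std::uint_least16_t(65)) == "uint_least16_t");
}
```

Calling `demo()` from `main()` runs the checks; the static_asserts already fail at compile time if any of the types turned out to be aliases.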
2) u16string and u32string are indeed part of C++0x and not just GCC'isms, as they are mentioned in various standard draft papers. They will be included in the new <string> header. Quote from the same article:
The Standard Library will also provide _Char16_t and _Char32_t typedefs, in analogy to the typedefs wstring, wcout, etc., for the following standard classes:
filebuf, streambuf, streampos, streamoff, ios, istream, ostream, fstream, ifstream, ofstream, stringstream, istringstream, ostringstream, string
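As a minimal sketch of how the string typedefs fit the API from the question, assuming a C++11 compiler: the class below is a hypothetical stand-in for the question's some_class (the `last()` accessor is invented so the chosen overload can be observed), and it checks only the string typedefs, which are what the question asks about.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// u16string and u32string are the standard names for these specializations.
static_assert(std::is_same<std::u16string, std::basic_string<char16_t>>::value,
              "u16string is basic_string<char16_t>");
static_assert(std::is_same<std::u32string, std::basic_string<char32_t>>::value,
              "u32string is basic_string<char32_t>");

// Hypothetical stand-in for the question's some_class. It records which
// overload was selected instead of storing the text.
class some_class {
public:
    void set(const std::string&)    { last_ = "utf8"; }
    void set(const std::wstring&)   { last_ = "wchar"; }
    void set(const std::u16string&) { last_ = "utf16"; }
    void set(const std::u32string&) { last_ = "utf32"; }

    const std::string& last() const { return last_; }

private:
    std::string last_;
};

// Passes each kind of string literal and checks the overload that was picked.
inline void demo_set() {
    some_class foo;
    foo.set(U"Some utf32 String");
    assert(foo.last() == "utf32");
    foo.set(u"Some utf16 string");
    assert(foo.last() == "utf16");
    foo.set(L"Some wchar string");
    assert(foo.last() == "wchar");
    foo.set("Some utf8 string");
    assert(foo.last() == "utf8");
}
```

Each literal converts to exactly one of the four string types, so the calls are unambiguous without any casts.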
Source: https://stackoverflow.com/questions/872491/new-unicode-characters-in-c0x