Are `char16_t` and `char32_t` misnomers?

慢半拍i 2021-02-07 20:50

NB: I'm sure someone will call this subjective, but I reckon it's fairly tangible.

C++11 gives us new basic_string types

3 Answers
  • 2021-02-07 20:55

    They are not fundamentally flawed, by definition - they are part of the standard. If that offends your sensibilities then you must find a way to deal with it. The time to make this argument was before the latest standard was ratified, and that time has long passed.

  • 2021-02-07 20:58

    are these names fundamentally flawed?

    (I think most of this question has been answered in the comments, but to make it an answer:) No, not at all. char16_t and char32_t were created for a specific purpose: to provide data type support for all Unicode encoding forms (UTF-8 is covered by char) while keeping them generic enough not to be limited to Unicode. Whether they are unsigned or fixed-width is not directly related to what they are: character data types, that is, types which hold and represent characters. Signedness is a property of data types that represent numbers, not characters. The types are meant to store 16-bit or 32-bit character data, nothing more and nothing less.
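
    As a rough illustration of "16-bit or 32-bit character data", the sketch below stores one code point outside the Basic Multilingual Plane in both string types. The names kUtf16, kUtf32, code_units and sanity_check are made up for this example, and it assumes the compiler encodes u"" literals as UTF-16 and U"" literals as UTF-32, which is what mainstream compilers do.

```cpp
#include <cassert>   // assert, used only in sanity_check()
#include <cstddef>   // std::size_t
#include <string>    // std::u16string, std::u32string

// U+1F600 lies outside the Basic Multilingual Plane, so it does not fit
// into a single 16-bit code unit.
const std::u16string kUtf16 = u"\U0001F600";
const std::u32string kUtf32 = U"\U0001F600";

// Hypothetical helpers: number of code units (not code points) in a string.
std::size_t code_units(const std::u16string& s) { return s.size(); }
std::size_t code_units(const std::u32string& s) { return s.size(); }

// Assumes UTF-16 / UTF-32 literal encoding, as described above.
void sanity_check() {
    assert(code_units(kUtf16) == 2);   // one surrogate pair
    assert(kUtf16[0] == 0xD83D);       // high surrogate
    assert(kUtf16[1] == 0xDE00);       // low surrogate
    assert(code_units(kUtf32) == 1);   // one 32-bit code unit
    assert(kUtf32[0] == 0x1F600);      // the code point itself
}
```

    The element type only fixes the size of a code unit; how many units make up a character depends on the encoding stored in them.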

  • 2021-02-07 21:07

    The naming convention to which you refer (uint32_t, int_fast32_t, etc.) is actually only used for typedefs, not for primitive types. The primitive integer types are {signed, unsigned} {char, short, int, long, long long} (as opposed to float or decimal types) ...

    However, in addition to those integer types, there are four distinct, unique, fundamental types, char, wchar_t, char16_t and char32_t, which are the types of the respective literals '', L'', u'' and U'' and are used for alphanumeric data, and similarly for arrays of those. Those types are of course also integer types, and thus they have the same layout as some of the arithmetic integer types, but the language makes a very clear distinction between the former, arithmetic types (which you would use for computations) and the latter "character" types, which form the basic unit of some kind of I/O data.
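
    The distinction can be checked at compile time. The following is only a sketch: describe and self_check are hypothetical names introduced here, and the size comparison relies on the standard's rule that char16_t has the same size as its underlying type uint_least16_t.

```cpp
#include <cassert>      // assert, used only in self_check()
#include <cstdint>      // std::uint_least16_t, std::uint_least32_t
#include <string>       // std::string
#include <type_traits>  // std::is_same, std::is_integral

// Each literal prefix selects its own character type.
static_assert(std::is_same<decltype('a'),  char>::value,     "'' is char");
static_assert(std::is_same<decltype(L'a'), wchar_t>::value,  "L'' is wchar_t");
static_assert(std::is_same<decltype(u'a'), char16_t>::value, "u'' is char16_t");
static_assert(std::is_same<decltype(U'a'), char32_t>::value, "U'' is char32_t");

// They are distinct types, not typedefs of the arithmetic integer types...
static_assert(!std::is_same<char16_t, std::uint_least16_t>::value, "distinct");
static_assert(!std::is_same<char32_t, std::uint_least32_t>::value, "distinct");

// ...yet they are still integer types with a matching layout.
static_assert(std::is_integral<char16_t>::value, "integral");
static_assert(sizeof(char16_t) == sizeof(std::uint_least16_t), "same size");

// Because the types are distinct, overload resolution can tell them apart.
std::string describe(char16_t)            { return "character"; }
std::string describe(std::uint_least16_t) { return "number"; }

void self_check() {
    assert(describe(u'a') == "character");
    assert(describe(std::uint_least16_t{97}) == "number");
}
```

    A typedef such as uint16_t could not take part in overloading this way, because it is just another name for an existing type.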

    (I've previously rambled about those new types here and here.)

    So, I think that char16_t and char32_t are actually very aptly named to reflect the fact that they belong to the "char" family of integer types.
