Question
I'm aware that there is already a standard method: prefixing the literal with L:
wchar_t *test_literal = L"Test";
The problem is that wchar_t is not guaranteed to be 16 bits, but for my project I need a 16-bit wchar_t. I'd also like to avoid the requirement of passing -fshort-wchar.
So, is there any prefix for C (not C++) that will allow me to declare a UTF-16 string literal?
Answer 1:
So, is there any prefix for C (not C++) that will allow me to declare a UTF-16 string literal?
Almost, but not quite. C2011 offers you these options:
- character string literals (elements of type char) - no prefix. Example: "Test"
- UTF-8 string literals (elements of type char) - 'u8' prefix. Example: u8"Test"
- wide string literals of three flavors:
  - wchar_t elements - 'L' prefix. Example: L"Test"
  - char16_t elements - 'u' prefix. Example: u"Test"
  - char32_t elements - 'U' prefix. Example: U"Test"
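As a rough illustration of the list above, the following sketch declares one array per literal form and checks the element count of the char16_t one. The array names and the two helper functions are made up for this example, and it assumes a C11 (or later) toolchain that ships uchar.h.

```c
#include <assert.h>  /* static_assert */
#include <stddef.h>
#include <stdio.h>
#include <uchar.h>   /* char16_t, char32_t */
#include <wchar.h>   /* wchar_t */

/* One array per C11 literal form. Array (not pointer) declarations let
   sizeof report the whole literal, terminating null included. */
static const char     plain[]  = "Test";
static const char     utf8[]   = u8"Test";
static const wchar_t  wide[]   = L"Test";
static const char16_t wide16[] = u"Test";
static const char32_t wide32[] = U"Test";

/* Four characters plus the terminating null element. */
static_assert(sizeof wide16 / sizeof wide16[0] == 5,
              "u\"Test\" should have 5 elements");

/* Hypothetical helper: element count of the char16_t literal. */
size_t wide16_length(void) {
    return sizeof wide16 / sizeof wide16[0];
}

/* Hypothetical helper: prints the per-element size of each literal form.
   The wchar_t figure is the one that varies between platforms. */
void print_element_sizes(void) {
    printf("char: %zu, u8: %zu, wchar_t: %zu, char16_t: %zu, char32_t: %zu\n",
           sizeof plain[0], sizeof utf8[0], sizeof wide[0],
           sizeof wide16[0], sizeof wide32[0]);
}
```

Calling print_element_sizes() from main on a given platform shows at a glance whether wchar_t and char16_t happen to have the same width there.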
Note well, however, that although you can declare a wide string literal having elements of type char16_t, the standard does not guarantee that the UTF-16 encoding will be used for them, nor does it make any particular requirements on which characters outside the language's basic character set must be included in the execution character set. You can test the former at compile time, however: if char16_t represents UTF-16-encoded characters in a given conforming implementation, then that implementation will define the macro __STDC_UTF_16__ to 1.
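A minimal sketch of that compile-time test might look like the following. The names emoji, char16_is_utf16 and emoji_units are invented for the example. It relies on the fact that U+1F600 lies outside the Basic Multilingual Plane, so under UTF-16 it must be stored as the surrogate pair 0xD83D 0xDE00.

```c
#include <assert.h>  /* static_assert */
#include <stddef.h>
#include <uchar.h>   /* char16_t */

/* A single character that needs two UTF-16 code units. */
static const char16_t emoji[] = u"\U0001F600";

#ifdef __STDC_UTF_16__
/* Only checked when the implementation promises UTF-16 for char16_t:
   two surrogates plus the terminating null. */
static_assert(sizeof emoji / sizeof emoji[0] == 3,
              "expected a surrogate pair plus terminator");
#endif

/* Hypothetical helper: 1 if the implementation documents char16_t
   as UTF-16 via __STDC_UTF_16__, otherwise 0. */
int char16_is_utf16(void) {
#ifdef __STDC_UTF_16__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: number of elements in the literal, null included. */
size_t emoji_units(void) {
    return sizeof emoji / sizeof emoji[0];
}
```

A project that cannot work without UTF-16 could replace the helper with an #error in the branch where the macro is undefined, so the build fails early instead of misbehaving at run time.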
Note also that you need to include (C's) uchar.h header to use the char16_t type name, but the u"..." syntax for literals does not depend on that. Take care, as this header name collides with one used by the C interface of the International Components for Unicode, a relatively widely-used package for Unicode support.
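If the header name collision is a concern, one possible workaround is to skip uchar.h and spell the element type as uint_least16_t, which C11 defines to be the same type as char16_t. This is only a sketch: the names test_literal and test_literal_length are chosen for the example.

```c
#include <assert.h>  /* static_assert */
#include <limits.h>  /* CHAR_BIT */
#include <stddef.h>
#include <stdint.h>  /* uint_least16_t; note that <uchar.h> is not included */

/* uint_least16_t is at least 16 bits wide, though not necessarily exactly 16. */
static_assert(sizeof(uint_least16_t) * CHAR_BIT >= 16,
              "element type must hold 16 bits");

/* The u"..." syntax is part of the language, so it works without <uchar.h>. */
static const uint_least16_t test_literal[] = u"Test";

/* Hypothetical helper: element count, terminating null included. */
size_t test_literal_length(void) {
    return sizeof test_literal / sizeof test_literal[0];
}
```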
Finally, be aware that much of this was new in C2011. To make use of it, you need a conforming C2011 implementation. Those are certainly available, but so are a lot of implementations that conform only to earlier standards, or even to none. Standard C99 and earlier do not provide a string literal syntax that guarantees 16-bit elements.
Answer 2:
You need a 16-bit wchar_t, but that is out of your control. If the compiler says it is 32 bits, then it is 32 bits, and it doesn't matter what you want or need.
In C++, the string classes are templated, so you can always instantiate the template with a 16-bit character type. I personally would try to remove any Unicode handling that is not UTF-8.
An alternative is a preprocessor check (a clever #if or #ifdef) that produces a compile-time error if wchar_t is not 16 bits, so that you only have to solve the problem when it actually arises.
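One way such a check could look is sketched below. REQUIRE_16BIT_WCHAR is a hypothetical project macro, not a standard one, and wchar_is_16bit is a helper invented for the example. The guard is there so that the file still compiles on platforms with a 32-bit wchar_t unless the project explicitly opts in to the hard error.

```c
#include <assert.h>  /* static_assert */
#include <limits.h>  /* CHAR_BIT */
#include <wchar.h>   /* wchar_t, WCHAR_MAX */

/* REQUIRE_16BIT_WCHAR is a made-up opt-in switch for this sketch.
   When it is defined, a wchar_t wider than 16 bits stops the build. */
#if defined(REQUIRE_16BIT_WCHAR) && WCHAR_MAX > 0xFFFF
#error "this project requires a 16-bit wchar_t"
#endif

#ifdef REQUIRE_16BIT_WCHAR
/* C11 alternative to the preprocessor test above. */
static_assert(sizeof(wchar_t) * CHAR_BIT == 16,
              "this project requires a 16-bit wchar_t");
#endif

/* Hypothetical helper: 1 if wchar_t values fit in 16 bits, else 0. */
int wchar_is_16bit(void) {
#if WCHAR_MAX > 0xFFFF
    return 0;
#else
    return 1;
#endif
}
```

Building with -DREQUIRE_16BIT_WCHAR turns the check into a compile-time error on platforms where wchar_t is wider than 16 bits.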
Source: https://stackoverflow.com/questions/50657874/how-do-you-safely-declare-a-16-bit-string-literal-in-c