Are UTF-16 characters (as used by, for example, the wide WinAPI functions) always 2 bytes long?

半阙折子戏 2021-02-09 06:23

Please clarify for me how UTF-16 works. I am a little confused, considering these points:

There is a static type in C++, WCHAR, ~~which is 2 bytes long. (alway~~

8 Answers

故里飘歌 (original poster)

2021-02-09 06:54

All characters in the Basic Multilingual Plane will be 2 bytes long.

Characters in other planes will be encoded into 4 bytes each, in the form of a surrogate pair.
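To make the two cases concrete, here is a minimal sketch of how a single code point maps to UTF-16 code units. The helper name `encode_utf16` is hypothetical (it is not a WinAPI or standard library function), and the sketch assumes the input is a valid Unicode scalar value.

```cpp
#include <string>

// Hypothetical helper: encode one Unicode code point as UTF-16 code units.
// Assumes cp is a valid scalar value (at most 0x10FFFF, not in 0xD800..0xDFFF).
std::u16string encode_utf16(char32_t cp) {
    std::u16string out;
    if (cp < 0x10000) {
        // Basic Multilingual Plane: a single 16-bit code unit (2 bytes).
        out.push_back(static_cast<char16_t>(cp));
    } else {
        // Other planes: subtract 0x10000 and split the remaining 20 bits
        // into a high surrogate and a low surrogate (4 bytes in total).
        char32_t v = cp - 0x10000;
        out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));    // high surrogate
        out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));  // low surrogate
    }
    return out;
}
```

With this sketch, U+20AC (the euro sign, in the BMP) yields one code unit, while U+1F600 (outside the BMP) yields the pair 0xD83D, 0xDE00.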

Consequently, a function that does not detect surrogate pairs and blindly treats every 2-byte code unit as a character will mishandle strings that contain such pairs.
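As an illustration of surrogate-aware handling, the following sketch counts code points instead of 16-bit code units. The function name `count_code_points` is hypothetical, and treating an unpaired surrogate as one code point is an assumption of this sketch, not something the answer specifies.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: count code points in a UTF-16 string.
// A high surrogate followed by a low surrogate counts as one code point.
// Assumption: an unpaired surrogate is counted as one code point by itself.
std::size_t count_code_points(const std::u16string& s) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const bool high = s[i] >= 0xD800 && s[i] <= 0xDBFF;
        const bool next_low = i + 1 < s.size() &&
                              s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF;
        if (high && next_low) {
            ++i;  // skip the low half of a valid surrogate pair
        }
        ++n;
    }
    return n;
}
```

For a string holding 'a', the pair 0xD83D 0xDE00, and 'b', `size()` reports 4 code units, whereas this function reports 3 code points.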
