How to write a 3-byte Unicode literal in Java?

旧街凉风 submitted on 2019-11-28 14:17:51

Java committed fully to Unicode back when people thought 64K code points would be enough for everyone (where have we heard that before?), so it started out with UCS-2 and was later upgraded to UTF-16.

But the language designers never added an escape sequence for Unicode characters outside the BMP (the Basic Multilingual Plane, U+0000 to U+FFFF).

Thus, to write such a character with escapes, your only recourse is to recode it manually as a UTF-16 surrogate pair and use two UTF-16 escapes.

Your example code point U+10428 becomes "\uD801\uDC28".
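If you would rather not rely on a web tool, the recoding can be done in a few lines of Java. The following is only a sketch: the class name SurrogatePairs and the method toEscapes are made up for illustration, and the arithmetic is the standard UTF-16 rule (subtract 0x10000, then split the remaining 20 bits into two 10-bit halves). Character.isSupplementaryCodePoint and Character.toChars are existing JDK methods.

```java
final class SurrogatePairs {

    // Hypothetical helper: returns the two UTF-16 escapes for a code point
    // in the supplementary range (U+10000 to U+10FFFF).
    static String toEscapes(int codePoint) {
        if (!Character.isSupplementaryCodePoint(codePoint)) {
            throw new IllegalArgumentException("not a supplementary code point");
        }
        int offset = codePoint - 0x10000;     // leaves a 20-bit value
        int high = 0xD800 + (offset >> 10);   // top 10 bits -> high surrogate
        int low = 0xDC00 + (offset & 0x3FF);  // bottom 10 bits -> low surrogate
        return String.format("\\u%04X\\u%04X", high, low);
    }

    public static void main(String[] args) {
        // Prints the two escapes to paste into a string literal.
        System.out.println(toEscapes(0x10428));

        // Cross-check: the JDK builds the same pair at runtime.
        String fromLiteral = "\uD801\uDC28";
        String fromJdk = new String(Character.toChars(0x10428));
        System.out.println(fromLiteral.equals(fromJdk)); // true
    }
}
```

If the value does not have to be a compile-time literal, new String(Character.toChars(0x10428)) avoids the manual recoding altogether.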

I used this site for the recoding: http://rishida.net/tools/conversion/

Quote from the Java Language Specification:

3.10.5 String Literals

A string literal consists of zero or more characters enclosed in double quotes. Characters may be represented by escape sequences (§3.10.6) - one escape sequence for characters in the range U+0000 to U+FFFF, two escape sequences for the UTF-16 surrogate code units of characters in the range U+010000 to U+10FFFF.
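As a small illustration of what the quoted rule means in practice, a literal written with two escapes holds two char values but only one code point. This is a minimal sketch; the class name LiteralDemo and the constant SAMPLE are assumptions made for the example, while length, codePointCount and codePointAt are standard String methods.

```java
final class LiteralDemo {

    // U+10428 written as its two UTF-16 surrogate escapes.
    static final String SAMPLE = "\uD801\uDC28";

    public static void main(String[] args) {
        // Number of UTF-16 code units (char values) in the string.
        System.out.println(SAMPLE.length()); // 2

        // Number of Unicode code points in the string.
        System.out.println(SAMPLE.codePointCount(0, SAMPLE.length())); // 1

        // The code point reassembled from the surrogate pair, in hex.
        System.out.println(Integer.toHexString(SAMPLE.codePointAt(0))); // 10428
    }
}
```

So code that counts or indexes characters in such strings should work on code points rather than on char values.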
