Get emoticon unicode from char UTF-16

Submitted by 早过忘川 on 2020-01-02 22:03:10

Question


I need to intercept an emoticon as it is entered and replace it with my own emoticon. When I intercept an emoticon, for example FACE WITH MEDICAL MASK (U+1F604), I get a UTF-16 char pair (0xD83D 0xDE04). Is it possible to convert this char value to the Unicode value?

I need to convert 0xD83D 0xDE04 to the code point U+1F604 (0x1F604).

Thanks,


Answer 1:


I get a UTF-16 char (0xD83D 0xDE04). Is it possible to convert this char value to the Unicode value?

For just a single code point in a string, you can convert it to an integer with:

int codepoint = "\uD83D\uDE04".codePointAt(0);  // 0x1F604
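If the two code units arrive as separate char values rather than as a String, the same result can be obtained from the surrogate-pair arithmetic that UTF-16 defines. The sketch below is only an illustration: the class name SurrogatePairDemo and the method combine are made up for this example, while Character.toCodePoint and Character.isSurrogatePair are the standard library calls it is checked against.

```java
final class SurrogatePairDemo {
    // Hypothetical helper: combines a high/low surrogate pair by hand.
    // Each surrogate carries 10 bits of the value (code point - 0x10000).
    static int combine(char high, char low) {
        if (!Character.isSurrogatePair(high, low)) {
            throw new IllegalArgumentException("not a valid surrogate pair");
        }
        return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
    }

    public static void main(String[] args) {
        char high = (char) 0xD83D;
        char low = (char) 0xDE04;

        int manual = combine(high, low);
        int viaLibrary = Character.toCodePoint(high, low);

        System.out.println(String.format("U+%X", manual)); // prints U+1F604
        System.out.println(manual == viaLibrary);          // prints true
    }
}
```

In practice Character.toCodePoint(high, low) is probably all that is needed; the manual version is shown only to make the mapping from 0xD83D 0xDE04 to 0x1F604 visible.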

It is, however, quite tedious to go over a whole string with codePointCount/codePointAt. Java/Dalvik's String type is strongly tied to UTF-16 code units, and the codePoint methods are a poorly integrated afterthought. If you simply want to replace an emoji with some other string of characters, you are probably best off doing a plain string replace or regex on the two code units as they appear in the String type, e.g. text.replace("\uD83D\uDE04", ":-D").
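For cases where a plain replace is not enough, for instance when many emoji have to be mapped at once, walking the string by code point might look roughly like the sketch below. The class EmojiReplacer, its REPLACEMENTS table and the replacement strings are assumptions made up for this example; only codePointAt, Character.charCount and StringBuilder.appendCodePoint come from the standard library.

```java
import java.util.HashMap;
import java.util.Map;

final class EmojiReplacer {
    // Hypothetical table: code point -> replacement text.
    static final Map<Integer, String> REPLACEMENTS = new HashMap<>();
    static {
        REPLACEMENTS.put(0x1F604, ":-D");
        REPLACEMENTS.put(0x1F637, ":-#");
    }

    static String replaceEmoji(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);          // reads one or two chars
            String replacement = REPLACEMENTS.get(cp);
            if (replacement != null) {
                out.append(replacement);
            } else {
                out.appendCodePoint(cp);           // copy unchanged
            }
            i += Character.charCount(cp);          // advance by 1 or 2 chars
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceEmoji("hi \uD83D\uDE04!")); // prints hi :-D!
    }
}
```

The index advances by Character.charCount(cp) rather than by one, which is what keeps the loop from splitting a surrogate pair.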

(BTW, FACE WITH MEDICAL MASK is U+1F637; U+1F604 is SMILING FACE WITH OPEN MOUTH AND SMILING EYES.)




Answer 2:


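0x0001F604 is the UTF-32 encoding of that emoticon, i.e. the code point U+1F604 stored as a single 32-bit value. You can convert a string to UTF-32 this way:

// U+1F637 (FACE WITH MEDICAL MASK) encodes to the four bytes 00 01 F6 37.
// getBytes(String) throws the checked UnsupportedEncodingException.
byte[] bytes = "\uD83D\uDE37".getBytes("UTF-32BE");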
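To get from those four bytes back to an integer code point, one possible approach is to read them as a big-endian int. The sketch below assumes the runtime provides a "UTF-32BE" charset (it is not one of the charsets every Java platform is required to support) and that the string is non-empty; the class name Utf32Demo and the method firstCodePointViaUtf32 are invented for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;

final class Utf32Demo {
    // Hypothetical helper: returns the first code point of s via UTF-32BE.
    // Assumes s is non-empty and that "UTF-32BE" is available at runtime.
    static int firstCodePointViaUtf32(String s) {
        // The Charset overload avoids the checked UnsupportedEncodingException.
        byte[] bytes = s.getBytes(Charset.forName("UTF-32BE"));
        // ByteBuffer is big-endian by default, matching UTF-32BE.
        return ByteBuffer.wrap(bytes).getInt();
    }

    public static void main(String[] args) {
        int cp = firstCodePointViaUtf32("\uD83D\uDE04");
        System.out.println(String.format("U+%X", cp)); // prints U+1F604
    }
}
```

For a single character this is more roundabout than codePointAt(0), so it is mainly useful when the UTF-32 bytes are needed anyway.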


Source: https://stackoverflow.com/questions/20689645/get-emoticon-unicode-from-char-utf-16
