How to convert UTF-16 (emoji) to HTML entity (hex) using Java

人盡茶涼 submitted on 2019-12-03 09:06:05

You can use the emoji4j library for this.

For example:

String line = "Hi , i am fine \uD83D\uDE02 \uD83D\uDE02, how r u ?";

EmojiUtils.hexHtmlify(line); // Hi , i am fine &#x1f602; &#x1f602;, how r u ?

Although the string appears to contain two Unicode characters, it is a single character encoded in UTF-16 as a surrogate pair (two char values); that is how Java strings work. You can obtain the actual code point with the String.codePointAt method. Here the code point is 0x1F602, which is Unicode 'FACE WITH TEARS OF JOY': 😂
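To make the surrogate-pair point concrete, here is a minimal sketch that inspects the same string. The class name SurrogatePairDemo and the helper describeFirstCodePoint are illustrative names, not part of any library.

```java
class SurrogatePairDemo {
    // Hypothetical helper: formats the code point that starts at index 0 as U+XXXX.
    static String describeFirstCodePoint(String s) {
        return String.format("U+%04X", s.codePointAt(0));
    }

    public static void main(String[] args) {
        String str = "\uD83D\uDE02";
        // length() counts UTF-16 code units (char values), not characters
        System.out.println(str.length());                              // 2
        // codePointCount() counts whole Unicode code points
        System.out.println(str.codePointCount(0, str.length()));       // 1
        // The two char values form a high/low surrogate pair
        System.out.println(Character.isHighSurrogate(str.charAt(0)));  // true
        System.out.println(Character.isLowSurrogate(str.charAt(1)));   // true
        System.out.println(describeFirstCodePoint(str));               // U+1F602
    }
}
```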

To write the character to HTML:

OPTION 1: generate the HTML escape entity

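import java.io.FileWriter;

String str = "\uD83D\uDE02";
// try-with-resources closes the writer even if a write fails
try (FileWriter w = new FileWriter("c:\\temp\\emoji.html")) {
    w.write("<html><body>");
    // codePointAt(0) combines the surrogate pair into the code point 0x1F602
    w.write("&#x" + Integer.toHexString(str.codePointAt(0)) + ";");
    w.write("</body></html>");
}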

This yields

<html><body>&#x1f602;</body></html>
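The snippet above handles a single character at index 0. For a whole line of text, the same idea can be applied to every code point. The following is only a sketch under stated assumptions: the class HexEntityEncoder and the method toHexEntities are hypothetical names, it assumes Java 8 or later (for String.codePoints), and it is not how emoji4j is implemented. It escapes every code point above U+007F, not just emoji.

```java
class HexEntityEncoder {
    // Hypothetical helper: replaces every code point above U+007F with a hex entity.
    static String toHexEntities(String text) {
        StringBuilder out = new StringBuilder(text.length());
        // codePoints() yields whole code points, so a surrogate pair arrives as one value
        text.codePoints().forEach(cp -> {
            if (cp > 0x7F) {
                out.append("&#x").append(Integer.toHexString(cp)).append(';');
            } else {
                out.appendCodePoint(cp); // plain ASCII is copied through unchanged
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        String line = "Hi , i am fine \uD83D\uDE02 \uD83D\uDE02, how r u ?";
        // prints: Hi , i am fine &#x1f602; &#x1f602;, how r u ?
        System.out.println(toHexEntities(line));
    }
}
```

Note that this sketch does not escape HTML-significant ASCII characters such as &, < and >; text from untrusted sources would need that handled separately.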

OPTION 2: write the file in a Unicode-capable encoding such as UTF-8

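import java.io.FileOutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;

String str = "\uD83D\uDE02";
// Write the file as UTF-8 and declare the same charset in the page itself
try (OutputStreamWriter w = new OutputStreamWriter(new FileOutputStream("c:\\temp\\emoji.html"), StandardCharsets.UTF_8)) {
    w.write("<html>\n<head><meta http-equiv=\"content-type\" content=\"text/html; charset=utf-8\"></head>\n<body>");
    w.write(str);
    w.write("</body></html>");
}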

This yields

<html>
<head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head>
<body>😂</body></html>

which is the same emoji, this time stored in the file as a four-byte UTF-8 sequence instead of an entity.
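To check what OPTION 2 actually puts in the file for this character, the UTF-8 encoding can be inspected directly. This is a minimal sketch; the class Utf8BytesDemo and the helper utf8Hex are hypothetical names used only for illustration.

```java
import java.nio.charset.StandardCharsets;

class Utf8BytesDemo {
    // Hypothetical helper: formats the UTF-8 encoding of a string as space-separated hex bytes.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask to 0..255 so negative byte values print as two hex digits
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(utf8Hex("\uD83D\uDE02")); // F0 9F 98 82
    }
}
```

If a viewer decodes these bytes with a different charset, the emoji shows up as a few unrelated characters, which is why the page declares charset=utf-8.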
