Convert a string containing ASCII to Unicode

I pass a string from my HTML page to my Java HttpServlet. In the request I receive ASCII codes that represent Chinese characters:

\"& #21487;& #20197;& #21578;&

2 Answers

梦如初夏

2021-01-25 14:53

A Java String contains Unicode characters. Any decoding from bytes already took place when the string was constructed.
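
To make that concrete, here is a minimal sketch (the class and method names are hypothetical, chosen only for illustration). The raw bytes are decoded exactly once, at the point where the String is built with an explicit charset; after that the string holds Unicode characters no matter what encoding the bytes arrived in.

```java
import java.nio.charset.StandardCharsets;

public class DecodeOnConstruct {
    // Decodes raw UTF-8 bytes into a Java String; the decoding happens here, once.
    static String decodeUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // UTF-8 encoding of U+53EF, the first character of the question's text
        byte[] raw = {(byte) 0xE5, (byte) 0x8F, (byte) 0xAF};
        String s = decodeUtf8(raw);
        System.out.println(s.length());        // count of UTF-16 code units
        System.out.println((int) s.charAt(0)); // numeric value of the first code unit
    }
}
```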

执笔经年

2021-01-25 14:55
There is no such thing as ASCII codes that display Chinese characters. ASCII is a 7-bit encoding that covers only 128 characters, none of them Chinese.

If you already have a Java string, it already represents every character (Latin, Chinese, and so on) internally as Unicode. You can then encode that string into bytes using a Unicode encoding such as UTF-8 or UTF-16:

~~String s = "可以告诉我";~~ (EDIT: This line won't display correctly on systems that lack fonts for Chinese characters)
```
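String s = "\u53ef\u4ee5\u544a\u8bc9\u6211"; // 可以告诉我
// getBytes returns a byte array; passing a Charset avoids the checked UnsupportedEncodingException
byte[] utfString = s.getBytes(java.nio.charset.StandardCharsets.UTF_8);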
```
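
As a rough illustration of what "encoding" means here, the sketch below (class and method names are hypothetical) encodes the same five-character string as UTF-8 and as UTF-16BE and decodes it back. It uses `StandardCharsets.UTF_16BE` rather than `UTF_16` so that no byte order mark is prepended to the output.

```java
import java.nio.charset.StandardCharsets;

public class EncodeDemo {
    // Encodes the string as UTF-8 bytes.
    static byte[] toUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Encodes the string as big-endian UTF-16 bytes, without a byte order mark.
    static byte[] toUtf16be(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        String s = "\u53ef\u4ee5\u544a\u8bc9\u6211";
        byte[] utf8 = toUtf8(s);
        byte[] utf16 = toUtf16be(s);
        // Each of these characters takes 3 bytes in UTF-8 and 2 bytes in UTF-16.
        System.out.println(utf8.length);   // 15
        System.out.println(utf16.length);  // 10
        // Decoding with the matching charset restores the original string.
        System.out.println(new String(utf8, StandardCharsets.UTF_8).equals(s)); // true
    }
}
```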
Looking at your updated question, you might want the StringEscapeUtils class from Apache Commons Text, which unescapes HTML entities into a Java string:
```
// import org.apache.commons.text.StringEscapeUtils; (Commons Text names the method unescapeHtml4)
String s = StringEscapeUtils.unescapeHtml4("&#21487;&#20197;&#21578;&#35785;&#25105;");
```
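
If adding a library is not an option, decimal numeric character references like the ones in the question can also be decoded with the standard library alone. The following is only a sketch under stated assumptions: `NumericEntityDecoder` and `decode` are hypothetical names, and it handles only decimal references of the form `&#NNNN;`, not hexadecimal references or named entities such as `&amp;`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NumericEntityDecoder {
    // Matches decimal numeric character references such as &#21487;
    private static final Pattern DECIMAL_REF = Pattern.compile("&#(\\d{1,7});");

    // Replaces each decimal reference with the character it denotes.
    static String decode(String input) {
        Matcher m = DECIMAL_REF.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(input, last, m.start());      // copy text before the reference
            int codePoint = Integer.parseInt(m.group(1));
            if (Character.isValidCodePoint(codePoint)) {
                out.appendCodePoint(codePoint);      // also handles supplementary characters
            } else {
                out.append(m.group());               // leave out-of-range references untouched
            }
            last = m.end();
        }
        out.append(input, last, input.length());     // copy the remaining tail
        return out.toString();
    }

    public static void main(String[] args) {
        String decoded = decode("&#21487;&#20197;&#21578;&#35785;&#25105;");
        System.out.println(decoded.equals("\u53ef\u4ee5\u544a\u8bc9\u6211")); // true
    }
}
```

For real HTML input a maintained library is usually the safer choice, since it covers named and hexadecimal entities as well.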