Is there a faster way to decode HTML characters to a string than Html.fromHtml()?

I am using Html.fromHtml(STRING).toString() to convert a string that may or may not contain HTML tags and/or HTML entities into a plain-text string.

This is pretty slow.

6 Answers

自闭症患者

2020-12-02 17:57
Although I have not tried them yet, I found some possible solutions:
1. HTML Java Parsers
2. HTML Parsing
3. More HTML Parsing
I hope it helps.

闹比i

2020-12-02 18:01

What about unescapeHtml() from org.apache.commons.lang.StringEscapeUtils? The library is available on the Apache site.

(EDIT: June 2019 - See the comments below for updates about the library)
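
To give a feel for what an entity unescaper does, here is a minimal sketch in plain Java. It is not the Commons Lang implementation: EntityDecoder and its tiny entity table are hypothetical names made up for illustration, and it only knows a handful of named entities plus numeric references. With the library on the classpath, the call would simply be StringEscapeUtils.unescapeHtml(input) in Commons Lang 2.x (renamed unescapeHtml4 in Lang 3).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of Android or Commons Lang. */
final class EntityDecoder {
    // Assumption: only a few named entities. A real library ships the full HTML table.
    private static final Map<String, Character> NAMED = new HashMap<>();
    static {
        NAMED.put("amp", '&');
        NAMED.put("lt", '<');
        NAMED.put("gt", '>');
        NAMED.put("quot", '"');
        NAMED.put("apos", '\'');
        NAMED.put("nbsp", '\u00A0');
    }

    static String decode(String in) {
        int amp = in.indexOf('&');
        if (amp < 0) return in;                 // fast path: nothing to decode, no allocation
        StringBuilder out = new StringBuilder(in.length());
        int pos = 0;
        while (amp >= 0) {
            out.append(in, pos, amp);           // copy the text before '&'
            int semi = in.indexOf(';', amp + 1);
            String replacement = (semi < 0) ? null : resolve(in.substring(amp + 1, semi));
            if (replacement == null) {
                out.append('&');                // unknown or malformed: keep the '&' literally
                pos = amp + 1;
            } else {
                out.append(replacement);
                pos = semi + 1;
            }
            amp = in.indexOf('&', pos);
        }
        out.append(in, pos, in.length());       // copy the tail
        return out.toString();
    }

    // Returns the decoded text for "amp", "#65", "#x41", ... or null if unrecognised.
    private static String resolve(String name) {
        if (name.isEmpty()) return null;
        if (name.charAt(0) == '#') {
            try {
                boolean hex = name.length() > 1
                        && (name.charAt(1) == 'x' || name.charAt(1) == 'X');
                int cp = hex ? Integer.parseInt(name.substring(2), 16)
                             : Integer.parseInt(name.substring(1), 10);
                return Character.isValidCodePoint(cp)
                        ? new String(Character.toChars(cp)) : null;
            } catch (NumberFormatException e) {
                return null;
            }
        }
        Character c = NAMED.get(name);
        return c == null ? null : c.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("Tom &amp; Jerry &lt;3"));
    }
}
```

The early return matters for this question: strings without any '&' are handed back untouched, so clean input costs a single scan.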

被撕碎了的回忆

2020-12-02 18:08

This is an incredibly fast and simple option: Unbescape

It greatly improved the performance of our parsing, which requires every string to be run through a decoder.

难免孤独

2020-12-02 18:08

Have you looked at Strip HTML from Text JavaScript?
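
The same regex idea carries over to Java. The following is a rough sketch only, with a hypothetical TagStripper class: it removes anything that looks like a tag and does nothing else. It does not decode entities, and it will mangle text such as "1 < 2 and 3 > 2" or markup with '>' inside attribute values, so treat it as a quick filter for simple input rather than a parser.

```java
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; a naive filter, not an HTML parser. */
final class TagStripper {
    // Compiled once and reused; String.replaceAll would recompile on every call.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    static String stripTags(String html) {
        if (html.indexOf('<') < 0) return html;   // fast path: no tags at all
        return TAG.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(stripTags("<p>Hello <b>world</b></p>"));
    }
}
```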

情歌与酒

2020-12-02 18:12

"With a large batch of these it can add over a minute."

Any parsing will take some time. 22 ms seems fast to me. Anyway, can you do it in the background? Would some kind of caching help?
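
A minimal sketch of both ideas, assuming the same strings show up more than once: wrap whatever decoder you use in a small LRU cache and run it on a worker thread. CachingDecoder is a hypothetical name, and the decoder passed in main is only a stand-in so the example runs off-device. On Android you would pass s -> Html.fromHtml(s).toString() instead, and android.util.LruCache could replace the LinkedHashMap. Note that java.util.function.Function needs API 24+ or desugaring on Android.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical wrapper for illustration: memoizes the results of a slow decoder. */
final class CachingDecoder {
    private final Function<String, String> decoder;
    private final Map<String, String> cache;

    CachingDecoder(Function<String, String> decoder, final int maxEntries) {
        this.decoder = decoder;
        // An access-ordered LinkedHashMap drops the least recently used entry when full.
        this.cache = Collections.synchronizedMap(
                new LinkedHashMap<String, String>(16, 0.75f, true) {
                    @Override
                    protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                        return size() > maxEntries;
                    }
                });
    }

    String decode(String raw) {
        String hit = cache.get(raw);
        if (hit != null) return hit;
        // Two threads may both miss and decode the same string; that is wasteful but harmless.
        String decoded = decoder.apply(raw);
        cache.put(raw, decoded);
        return decoded;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in decoder (assumption) so the sketch runs without Android classes.
        CachingDecoder decoder = new CachingDecoder(s -> s.replace("&amp;", "&"), 500);
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // Decode off the calling thread; a real app would deliver the result via a callback.
            Future<String> pending = pool.submit(() -> decoder.decode("Tom &amp; Jerry"));
            System.out.println(pending.get());   // blocking here only for the demo
        } finally {
            pool.shutdown();
        }
    }
}
```

Caching only pays off if inputs repeat. If every string in the batch is unique, the background thread is the part that helps.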

无人共我

2020-12-02 18:18

fromHtml() does not have a high-performance HTML parser, and I have no idea how quick the toString() implementation on SpannedString is. I doubt either was designed for your scenario.

Ideally, the strings are clean before they get to a low-power phone. Either clean them up in the build process (for resources/assets), or clean them up on a server (before you download them).

If, for whatever reason, you absolutely need to clean them up on the device, you can perhaps use the NDK to create a C/C++ library that does the cleaning for you faster.
