Is there a faster way to decode html characters to a string than Html.fromHtml()?

轻奢々 2020-12-02 17:24

I am using Html.fromHtml(STRING).toString() to convert a string that may or may not contain HTML and/or HTML entities to a plain-text string.

This is pretty slow.

6 answers
  • 2020-12-02 17:57

    Although I have not tried them yet, I found some possible solutions:

    1. HTML Java Parsers
    2. HTML Parsing
    3. More HTML Parsing

    I hope it helps.

  • 2020-12-02 18:01

    What about unescapeHtml() from org.apache.commons.lang.StringEscapeUtils? The library is available on the Apache site.

    (EDIT: June 2019 - See the comments below for updates about the library)
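    When the strings contain entities but no tags, an entity-only decoder of the kind this answer points to can be sketched in plain Java. This is only a minimal illustration, not the Commons implementation: the class name EntityDecoder and its short entity table are hypothetical, it covers a handful of named entities plus numeric references, and unlike Html.fromHtml() it does not strip tags.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper; not part of Android or of any library named in this thread. */
final class EntityDecoder {
    // Assumption: only these named entities matter. Extend the table as needed.
    private static final Map<String, String> NAMED = new HashMap<>();
    static {
        NAMED.put("amp", "&");
        NAMED.put("lt", "<");
        NAMED.put("gt", ">");
        NAMED.put("quot", "\"");
        NAMED.put("apos", "'");
        NAMED.put("nbsp", "\u00A0");
    }

    private EntityDecoder() {}

    /** Decodes known entities in one pass; unknown ones are left untouched. */
    static String decode(String input) {
        int amp = input.indexOf('&');
        if (amp < 0) {
            return input; // fast path: nothing to decode, nothing allocated
        }
        StringBuilder out = new StringBuilder(input.length());
        int pos = 0;
        while (amp >= 0) {
            out.append(input, pos, amp); // copy the text before the '&'
            int semi = input.indexOf(';', amp + 1);
            String decoded = null;
            // Entities are short; anything longer is treated as literal text.
            if (semi > amp + 1 && semi - amp <= 10) {
                decoded = resolve(input.substring(amp + 1, semi));
            }
            if (decoded != null) {
                out.append(decoded);
                pos = semi + 1;
            } else {
                out.append('&'); // not an entity we know: keep the '&' as-is
                pos = amp + 1;
            }
            amp = input.indexOf('&', pos);
        }
        out.append(input, pos, input.length());
        return out.toString();
    }

    /** Returns the replacement for an entity name, or null if it is not recognized. */
    private static String resolve(String name) {
        if (name.charAt(0) != '#') {
            return NAMED.get(name);
        }
        try {
            boolean hex = name.length() > 1
                    && (name.charAt(1) == 'x' || name.charAt(1) == 'X');
            int codePoint = hex
                    ? Integer.parseInt(name.substring(2), 16)
                    : Integer.parseInt(name.substring(1), 10);
            return Character.isValidCodePoint(codePoint)
                    ? new String(Character.toChars(codePoint))
                    : null;
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```

    A call such as EntityDecoder.decode("a &amp; b") yields "a & b". Strings without any '&' are returned unchanged without allocating, which may matter when most of a large batch is already clean.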

  • 2020-12-02 18:08

    This is an incredibly fast and simple option: Unbescape

    It greatly improved our parsing performance, since our parser has to run every string through a decoder.

  • 2020-12-02 18:08

    Have you looked at "Strip HTML from Text JavaScript"?

  • 2020-12-02 18:12

    "With a large batch of these it can add over a minute"

    Any parsing will take some time, and 22 ms seems fast to me. Anyway, can you do it in the background? Would some kind of caching help?
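    The caching idea can be sketched as a small least-recently-used cache in front of the slow conversion. This is only a sketch under assumptions: DecodeCache and its Decoder interface are hypothetical names, the decoder passed in stands in for whatever conversion is slow (for example Html.fromHtml(s).toString() on Android), and it only pays off if the same input strings recur.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: remembers the results of a slow string conversion. */
final class DecodeCache {
    /** Stand-in for the slow call being cached. */
    interface Decoder {
        String decode(String input);
    }

    private final Decoder decoder;
    private final Map<String, String> cache;

    DecodeCache(Decoder decoder, final int maxEntries) {
        this.decoder = decoder;
        // accessOrder = true keeps entries ordered from least to most recently used,
        // so the "eldest" entry is the one that has gone unused the longest.
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns the cached result, running the decoder only on a cache miss. */
    synchronized String get(String input) {
        String cached = cache.get(input);
        if (cached == null) {
            cached = decoder.decode(input);
            cache.put(input, cached);
        }
        return cached;
    }
}
```

    The get() method is synchronized so that the cache can also be filled from a background thread, as suggested above. The size limit keeps memory bounded on a low-power device.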

  • 2020-12-02 18:18

    fromHtml() does not have a high-performance HTML parser, and I have no idea how quick the toString() implementation on SpannedString is. I doubt either was designed for your scenario.

    Ideally, the strings are clean before they get to a low-power phone. Either clean them up in the build process (for resources/assets), or clean them up on a server (before you download them).

    If, for whatever reason, you absolutely need to clean them up on the device, you can perhaps use the NDK to create a C/C++ library that does the cleaning for you faster.
