HTML parsing in Android

逝去的感伤 2021-01-15 23:21

I am trying to learn how to parse HTML, but as I don't have a lot of experience in either Java or Android, it's a little complicated. I have read the IBM XML parsing tutor

2 answers
  • 2021-01-15 23:51

    IMO there are two easy ways to parse HTML:

    • Convert the HTML to XML (XHTML) using a library (e.g. HTMLTidy) and then use an XML parser
    • Use an existing HTML parser (e.g. a standard Web browser engine like WebKit, Firefox, and/or IE) and then read the "DOM", which is a more-or-less API-friendly representation of the parsed HTML

    Alternatively, if you want to write your own parser (which I doubt you should, for homework: a proper, complete implementation would be long and complicated), see the specs for parsing HTML.
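
To illustrate the second half of the first option (running an XML parser over markup that is already well-formed XHTML), here is a minimal sketch using only the standard `javax.xml.parsers` and `org.w3c.dom` APIs, which are also available on Android. The class name `XhtmlLinkExtractor`, the method `extractLinks`, and the sample markup are made up for illustration. It assumes the HTML-to-XHTML cleanup step has already happened; real-world HTML that is not well-formed will be rejected by the parser.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: collects link targets from well-formed XHTML.
class XhtmlLinkExtractor {

    /** Returns the href value of every <a> element that has one, in document order. */
    static List<String> extractLinks(String xhtml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xhtml)));

            // The default factory is not namespace-aware, so this matches plain <a> tags.
            NodeList anchors = doc.getElementsByTagName("a");
            List<String> links = new ArrayList<>();
            for (int i = 0; i < anchors.getLength(); i++) {
                Element anchor = (Element) anchors.item(i);
                if (anchor.hasAttribute("href")) {
                    links.add(anchor.getAttribute("href"));
                }
            }
            return links;
        } catch (Exception e) {
            // An XML parser fails on markup that is not well-formed (e.g. unclosed tags).
            throw new IllegalArgumentException("Input is not well-formed XHTML", e);
        }
    }

    public static void main(String[] args) {
        String xhtml = "<html><body><p>"
                + "<a href=\"http://example.com/\">Example</a>"
                + "</p></body></html>";
        System.out.println(extractLinks(xhtml));
    }
}
```

The sample input deliberately has no DOCTYPE, so the parser does not try to fetch a DTD. If the input might be tag soup rather than clean XHTML, the cleanup step from the first option has to run before this code.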

  • 2021-01-15 23:54

    Check out the following HTML parsers. There are more out there. Maybe one will work for you:

    • HTMLCleaner: http://htmlcleaner.sourceforge.net/

    • TagSoup: http://ccil.org/~cowan/XML/tagsoup/

    • Jericho: http://jericho.htmlparser.net/docs/index.html
