HTML parsing in Android

Posted by 血红的双手。 on 2019-12-30 11:31:48

Question


I am trying to learn how to parse HTML, but since I don't have much experience with either Java or Android, I'm finding it a little complicated. I have read the IBM XML parsing tutorial and have learned to parse an RSS feed. My problem is that I would like to get data from an HTML site. I have read some information on HTMLCleaner, JSON, etc., but I can't find a good tutorial to help me. Do you know of any tutorials that might be helpful?

Thanks.


Answer 1:


Check out the following HTML parsers. There are more out there. Maybe one will work for you:

  • HTMLCleaner: http://htmlcleaner.sourceforge.net/

  • TagSoup: http://ccil.org/~cowan/XML/tagsoup/

  • Jericho: http://jericho.htmlparser.net/docs/index.html




Answer 2:


IMO there are two easy ways to parse HTML:

  • Convert the HTML to XML (XHTML) using a library (e.g. HTMLTidy) and then use an XML parser
  • Use an existing HTML parser (e.g. a standard web browser like WebKit, Firefox, and/or IE) and then read the "DOM", which is a more-or-less API-friendly representation of the parsed HTML
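
As a rough illustration of the first option, here is a minimal sketch of the "then use an XML parser" step. It assumes the page has already been converted to well-formed XHTML (for example by HTMLTidy) and uses only the standard javax.xml.parsers DOM API, which is also available on Android. The class name XhtmlLinkExtractor, the method extractLinks, and the sample markup are made up for this example. Note that real-world XHTML with a DOCTYPE may cause the parser to try to load the DTD, which this sketch does not handle.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: the class and method names are illustrative only.
public class XhtmlLinkExtractor {

    /** Returns the href value of every <a> element in a well-formed XHTML string. */
    public static List<String> extractLinks(String xhtml) throws Exception {
        // Assumption: the input is already well-formed XML (e.g. tidied HTML).
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xhtml)));

        List<String> links = new ArrayList<>();
        NodeList anchors = doc.getElementsByTagName("a");
        for (int i = 0; i < anchors.getLength(); i++) {
            Element anchor = (Element) anchors.item(i);
            // Skip anchors without an href (e.g. named anchors).
            if (anchor.hasAttribute("href")) {
                links.add(anchor.getAttribute("href"));
            }
        }
        return links;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample markup standing in for a tidied page.
        String xhtml = "<html><body>"
                + "<p>See <a href=\"http://example.com/a\">A</a> and "
                + "<a href=\"http://example.com/b\">B</a>.</p>"
                + "</body></html>";
        for (String link : extractLinks(xhtml)) {
            System.out.println(link);
        }
    }
}
```

The same pattern works for any other element or attribute: once the markup is well-formed, it can be queried like the RSS feed from the XML tutorial.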

Alternatively, if you want to write your own parser (which I doubt you should, other than as homework: implementing it properly and completely would be long and complicated), see the specs for parsing HTML.



Source: https://stackoverflow.com/questions/4831513/html-parsing-in-android
