Web scraping in Objective-C

误落风尘 2021-02-15 16:22

Is there an Objective-C library for parsing HTML, like Python's BeautifulSoup? Thanks.

2 answers
  • 2021-02-15 16:27

    Apple provides NSXMLDocument and NSXMLParser, both of which accept tidied HTML input (see the Tree-Based XML Programming Guide).

    NSXMLDocument is currently not available on iOS (as of 4.3), so there you would have to use either NSXMLParser or libxml2.

    More information on potential problems with parsing malformed HTML:
    What's the best approach for parsing XML/'screen scraping' in iOS? UIWebview or NSXMLParser?

    The most reliable solution is to use an off-screen WebView, load the HTML source into it and then access its DOM tree.
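
    For input that is already well-formed (tidied) XHTML, the NSXMLParser route mentioned above could look roughly like the sketch below. `LinkCollector` and `ExtractLinks` are hypothetical names made up for this illustration, and the code assumes ARC and Foundation only. NSXMLParser is a strict XML parser, so typical malformed HTML makes `parse` fail, which is why a tidy step or a real HTML parser is needed for pages found in the wild.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate: collects the href attribute of every <a> element.
@interface LinkCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray *links;
@end

@implementation LinkCollector

- (instancetype)init {
    if ((self = [super init])) {
        _links = [[NSMutableArray alloc] init];
    }
    return self;
}

// Called once per opening tag; attributes arrive as a name -> value dictionary.
- (void)parser:(NSXMLParser *)parser
didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI
 qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict {
    if ([elementName caseInsensitiveCompare:@"a"] == NSOrderedSame) {
        NSString *href = [attributeDict objectForKey:@"href"];
        if (href != nil) {
            [self.links addObject:href];
        }
    }
}

@end

// Hypothetical helper: returns all link targets in document order,
// or nil when the input is not well-formed XML.
static NSArray *ExtractLinks(NSString *xhtml) {
    NSData *data = [xhtml dataUsingEncoding:NSUTF8StringEncoding];
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:data];
    LinkCollector *collector = [[LinkCollector alloc] init];
    parser.delegate = collector;  // the local variable keeps the delegate alive during parsing
    return [parser parse] ? [collector.links copy] : nil;
}
```

    Calling `ExtractLinks` on a tidied page returns the link targets as strings; a nil result signals that the markup has to be cleaned up first (for example with libtidy) or handed to a more tolerant parser.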

  • 2021-02-15 16:30

    The best way I have found is NSXMLParser + libtidy. However, many third-party libraries are now available that make parsing easier (the previous answer was written in 2011).

    • Google's Gumbo HTML5 parser is pretty good. It's written in pure C99 and can be used from Objective-C (through a wrapper like this one).
    • If you want pure Objective-C libraries, Ono and hpple are good. HTMLReader is also a good alternative.
    • If Swift is your thing, you could use NDHpple, a Swift wrapper based on hpple, or Swift-HTML-Parser. (Bonus: Alamofire is as good as Python Requests and is a joy to use.)