NSXMLDocumentTidyHTML doesn't tidy some XHTML validation errors
Question

I want to grab text from a list of web pages. I've done a bit of experimenting and found that the best way for my needs is via WebKit. Once the source of the page has been grabbed, I want to strip out all the HTML tags using the technique in this comment. Here's my code:

    - (void)webView:(WebView *)sender didFinishLoadForFrame:(WebFrame *)frame {
        if (frame == [sender mainFrame]) {
            NSString *content = [[[[sender mainFrame] dataSource] representation] documentSource];
            NSXMLDocument