Question
I am trying to extract the summary of a Wikipedia article and store it as a string. This works well for some articles, but Wikipedia's markup is inconsistent from page to page, so my NSScanner code fails fairly often.
Here's my NSScanner implementation:
// `string` holds the HTML source of the article page.
NSString *separatorString = @"<table id=\"toc\" class=\"toc\">";
NSString *muString = @"</table>";
NSString *container = nil;
NSScanner *aScanner = [NSScanner scannerWithString:string];
// Skip everything up to and including the first closing </table> tag.
[aScanner scanUpToString:muString intoString:nil];
[aScanner scanString:muString intoString:nil];
// Capture the text between that table and the table of contents.
[aScanner scanUpToString:separatorString intoString:&container];
How could this be improved? Or is there another way of getting this?
To visualize which bit of the article I want, here's an example:
http://en.wikipedia.org/wiki/Indigo
From this page, I'd want everything from "Indigo is the color on the electromagnetic spectrum" to "in English was in 1289".
Thanks!
Answer 1:
You could use WebKit's DOM API to walk the actual structure, rather than trying to parse the text blindly.
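As a rough illustration of that suggestion, here is a minimal sketch using the legacy WebKit DOM classes on the Mac (DOMDocument, DOMElement, DOMNode). It assumes the page has already been loaded in a WebView, and that the lead paragraphs are <p> elements sitting directly before the element with id "toc" as its siblings; that layout is an assumption based on the markup quoted in the question and may not hold for every article. The helper name LeadSummaryFromDocument is hypothetical.

```objc
#import <Foundation/Foundation.h>
#import <WebKit/WebKit.h>

// Hypothetical helper: returns the text of the <p> elements that precede
// the table of contents, or nil if the page has no element with id "toc".
static NSString *LeadSummaryFromDocument(DOMDocument *document) {
    DOMElement *toc = [document getElementById:@"toc"];
    if (toc == nil) {
        return nil; // No table of contents found (or no document at all).
    }

    NSMutableArray *paragraphs = [NSMutableArray array];
    // Walk backwards through the siblings before the table of contents.
    // Assumption: the lead paragraphs are siblings of the "toc" element.
    for (DOMNode *node = [toc previousSibling]; node != nil; node = [node previousSibling]) {
        BOOL isParagraph = [node nodeType] == DOM_ELEMENT_NODE &&
            [[node nodeName] caseInsensitiveCompare:@"P"] == NSOrderedSame;
        if (!isParagraph) {
            continue; // Skip infobox tables, hatnotes, whitespace text nodes.
        }
        NSString *text = [node textContent];
        if ([text length] > 0) {
            // Insert at the front so the paragraphs keep document order.
            [paragraphs insertObject:text atIndex:0];
        }
    }
    return [paragraphs componentsJoinedByString:@"\n\n"];
}
```

One way to call it would be from the WebView frame load delegate method webView:didFinishLoadForFrame:, passing [frame DOMDocument]. Because the walk looks at element types instead of matching raw markup, it does not depend on whether an infobox table comes first, which is where a string scanner tends to break.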
Source: https://stackoverflow.com/questions/3772414/getting-wikipedia-article-summary-using-nsscanner-problem