Getting Wikipedia Article Summary using NSScanner Problem

Submitted by 时光怂恿深爱的人放手 on 2019-12-11 13:29:19

Question


I am trying to extract the summary of a Wikipedia article and download it as a string. This works well for some articles, but Wikipedia's markup is inconsistent, so my NSScanner code often fails on others.

Here's my NSScanner implementation:

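// `string` is assumed to hold the HTML of the article, downloaded earlier.
NSString *separatorString = @"<table id=\"toc\" class=\"toc\">";  // opening tag of the table of contents
NSString *muString = @"</table>";                                 // closing tag of the first table
NSString *container = nil;

// A new scanner starts at location 0, so no explicit setScanLocation: is needed.
NSScanner *aScanner = [NSScanner scannerWithString:string];

// Skip everything up to and including the first </table>.
[aScanner scanUpToString:muString intoString:nil];
[aScanner scanString:muString intoString:nil];

// Capture the text between that table and the table of contents.
[aScanner scanUpToString:separatorString intoString:&container];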

How could this be improved? Or is there another way of getting this?

To visualize which bit of the article I want, here's an example:

http://en.wikipedia.org/wiki/Indigo

From this page I'd want everything from "Indigo is the color on the electromagnetic spectrum" to "in English was in 1289".

Thanks!


Answer 1:


You could use WebKit's DOM API to walk the actual structure, rather than trying to parse the text blindly.
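As a rough illustration of that suggestion, here is a minimal sketch using the legacy WebKit DOM classes on macOS (DOMDocument, DOMElement, DOMNode). The helper name LeadSummaryFromDocument is hypothetical, and the sketch rests on two assumptions: the table of contents is the element with id "toc" (as in the question's separator string), and the summary paragraphs are <p> elements that are preceding siblings of that element. Wikipedia's markup may not always match this.

```objc
#import <Foundation/Foundation.h>
#import <WebKit/WebKit.h>

// Hypothetical helper: returns the text of every <p> element that precedes
// the table of contents, in document order, or nil if no TOC is found.
// Assumes the TOC has id "toc" and the lead paragraphs are its siblings.
static NSString *LeadSummaryFromDocument(DOMDocument *document) {
    DOMElement *toc = [document getElementById:@"toc"];
    if (toc == nil) {
        return nil;  // no document, or an article without a table of contents
    }

    NSMutableArray *paragraphs = [NSMutableArray array];
    // Walk backwards from the TOC over its preceding siblings.
    for (DOMNode *node = [toc previousSibling]; node != nil; node = [node previousSibling]) {
        if ([[node nodeName] caseInsensitiveCompare:@"P"] == NSOrderedSame) {
            NSString *text = [node textContent];
            if ([text length] > 0) {
                // Insert at the front so the result keeps document order.
                [paragraphs insertObject:text atIndex:0];
            }
        }
    }
    return [paragraphs componentsJoinedByString:@"\n\n"];
}
```

With a legacy WebView, this could be called once the page has finished loading, for example from the frame load delegate's webView:didFinishLoadForFrame: with [[webView mainFrame] DOMDocument]. Because textContent returns only the text of each paragraph, no tag stripping is needed afterwards.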



Source: https://stackoverflow.com/questions/3772414/getting-wikipedia-article-summary-using-nsscanner-problem
