Question
I need to search an HTML document for two specific strings of text in Cocoa. I am creating an NSXMLDocument from the web page (Page Example), and then trying to search it for the app title and the URL of the icon. I am currently using this code to search for the specific strings:
NSString *xpathQueryStringTitle = @"//div[@id='desktopContentBlockId']/div[@id='content']/div[@class='padder']/div[@id='title' @class='intro has-gcbadge']/h1";
NSString *xpathQueryStringIcon = @"//div[@id='desktopContentBlockId']/div[@id='content']/div[@class='padder']/div[@id='left-stack']/div[@class='lockup product application']/a";
NSArray *titleItemsNodes = [document nodesForXPath:xpathQueryStringTitle error:&error];
if (error)
{
    [[NSAlert alertWithError:error] runModal];
    return;
}
error = nil;
NSArray *iconItemsNodes = [document nodesForXPath:xpathQueryStringIcon error:&error];
if (error)
{
    [[NSAlert alertWithError:error] runModal];
    return;
}
When I try to search for these strings I get the error: "XQueryError:3 - "invalid token (@) - ./*/div[@id='desktopContentBlockId']/div[@id='content']/div[@class='padder']/div[@id='title' @class='intro has-gcbadge']/h1" at line:1"
I am loosely following this tutorial.
I also tried this without all of the @ symbols in the XPath, and it also returns an error. My XPath syntax is obviously wrong. What would the basic syntax be for this path? I've seen plenty of examples with a basic XML tree, but not with HTML.
Answer 1:
I suspect the problem is the part near the end, where you test two attributes inside one predicate with nothing joining them:
/div[@id='title' @class='intro has-gcbadge']/h1";
Try changing it to:
/div[@id='title'][@class='intro has-gcbadge']/h1";
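To make the difference concrete, here is a minimal sketch, assuming ARC and a tiny hand-written XML string standing in for the real page. The helper `CountMatches` is hypothetical (not part of Cocoa); it only wraps `-initWithXMLString:options:error:` and `-nodesForXPath:error:`. It runs the chained-predicate form and the equivalent form that joins both tests with `and`.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: parses `xml` and returns how many nodes `query` selects,
// or -1 if parsing or the query itself fails.
static NSInteger CountMatches(NSString *xml, NSString *query) {
    NSError *error = nil;
    NSXMLDocument *document = [[NSXMLDocument alloc] initWithXMLString:xml
                                                               options:NSXMLNodeOptionsNone
                                                                 error:&error];
    if (document == nil) {
        NSLog(@"Parse failed: %@", error);
        return -1;
    }
    NSArray *nodes = [document nodesForXPath:query error:&error];
    if (nodes == nil) {
        NSLog(@"Query failed: %@", error);
        return -1;
    }
    return (NSInteger)nodes.count;
}

int main(void) {
    @autoreleasepool {
        // Simplified stand-in for the real page; only the attribute layout matters.
        NSString *xml = @"<div id='content'>"
                        @"<div id='title' class='intro has-gcbadge'><h1>App Title</h1></div>"
                        @"</div>";
        // Chained predicates: every bracketed test must hold.
        NSLog(@"%ld", (long)CountMatches(xml, @"//div[@id='title'][@class='intro has-gcbadge']/h1"));
        // A single predicate that joins both tests with `and`.
        NSLog(@"%ld", (long)CountMatches(xml, @"//div[@id='title' and @class='intro has-gcbadge']/h1"));
    }
    return 0;
}
```

Both forms should select the same single h1 in this sample. Note that either way the class attribute has to equal the whole string 'intro has-gcbadge', not just one of the two class names.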
Answer 2:
OP's additional questions (from comments):
but I need to modify the returned strings. For the first string, I get "<h1>App Title</h1>"; what would I add to get just the text inside the <h1>?
Use:
/div[@id='title' and @class='intro has-gcbadge']/h1/text()
or use:
string(/div[@id='title' and @class='intro has-gcbadge']/h1)
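On the Cocoa side, a rough sketch of how the text can be read back, assuming ARC and a sample document whose root is the title div (so the absolute path above applies as written). `FirstStringValue` is a hypothetical helper, not a Cocoa API. It shows two routes: selecting the text node with text(), or selecting the h1 element and asking it for its `stringValue`.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: string value of the first node `query` selects, or nil.
static NSString *FirstStringValue(NSXMLDocument *document, NSString *query) {
    NSError *error = nil;
    NSArray *nodes = [document nodesForXPath:query error:&error];
    if (nodes == nil) {
        NSLog(@"Query failed: %@", error);
        return nil;
    }
    if (nodes.count == 0) {
        return nil;
    }
    NSXMLNode *node = nodes[0];
    return node.stringValue;
}

int main(void) {
    @autoreleasepool {
        // Simplified stand-in for the real page, with the title div as the root.
        NSString *xml = @"<div id='title' class='intro has-gcbadge'><h1>App Title</h1></div>";
        NSError *error = nil;
        NSXMLDocument *document = [[NSXMLDocument alloc] initWithXMLString:xml
                                                                   options:NSXMLNodeOptionsNone
                                                                     error:&error];
        if (document == nil) {
            NSLog(@"Parse failed: %@", error);
            return 1;
        }
        NSString *base = @"/div[@id='title' and @class='intro has-gcbadge']/h1";
        // Route 1: select the text node under <h1> directly.
        NSLog(@"%@", FirstStringValue(document, [base stringByAppendingString:@"/text()"]));
        // Route 2: select the <h1> element and let -stringValue return its text content.
        NSLog(@"%@", FirstStringValue(document, base));
    }
    return 0;
}
```

The second route avoids changing the query at all, which may be simpler if the h1 ever contains nested markup, since text() only selects its direct text children.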
On the second string, I get the entire <img width="111" src="link">; how would I return the value of link from the src tag?
Use:
YourSecond-Not-Shown-Expression/@src
or use:
string(YourSecond-Not-Shown-Expression/@src)
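Since the second expression was never shown, the following is only a sketch under stated assumptions: the path to the img element, the sample markup, and the icon URL are all made up for illustration, and `AttributeValues` is a hypothetical helper. The point it illustrates is that a query ending in /@src returns attribute nodes, and each node's `stringValue` is the attribute's value.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: string values of every node `query` selects (empty on failure).
static NSArray *AttributeValues(NSXMLDocument *document, NSString *query) {
    NSError *error = nil;
    NSArray *nodes = [document nodesForXPath:query error:&error];
    if (nodes == nil) {
        NSLog(@"Query failed: %@", error);
        return @[];
    }
    NSMutableArray *values = [NSMutableArray array];
    for (NSXMLNode *node in nodes) {
        // For an attribute node, -stringValue is the attribute's value.
        if (node.stringValue != nil) {
            [values addObject:node.stringValue];
        }
    }
    return values;
}

int main(void) {
    @autoreleasepool {
        // Assumed markup and URL; the real page's structure may differ.
        NSString *xml = @"<div class='lockup product application'>"
                        @"<a><img width='111' src='http://example.com/icon.png'/></a>"
                        @"</div>";
        NSError *error = nil;
        NSXMLDocument *document = [[NSXMLDocument alloc] initWithXMLString:xml
                                                                   options:NSXMLNodeOptionsNone
                                                                     error:&error];
        if (document == nil) {
            NSLog(@"Parse failed: %@", error);
            return 1;
        }
        // Assumed path: the icon is an <img> inside the lockup's <a>.
        NSString *query = @"//div[@class='lockup product application']/a/img/@src";
        NSLog(@"%@", AttributeValues(document, query));
    }
    return 0;
}
```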
Source: https://stackoverflow.com/questions/8055817/nsxmldocument-search-with-nodesforxpath