Question
I am using Google Docs for web scraping. More specifically, I am using the IMPORTXML function built into Google Sheets, with XPath expressions to select the nodes I want to scrape data from.
What I am trying to do is check whether a particular node exists and, if it does, select some other random node.
/*IF THIS NODE EXISTS*/
if(exists(//table/tr/td[2]/a/img[@class='special'])){
/*SELECT THIS NODE*/
//table/tr/td[2]/a
}
Answer 1:
You don't have logic quite like that in XPath, but you might be able to do something like what you want.
If you want to select //table/tr/td[2]/a but only if it has an img[@class='special'] in it, then you can use //table/tr/td[2]/a[img[@class='special']].
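As a quick check of that filter, here is a minimal sketch in Python using the third-party lxml library as a stand-in XPath engine (an assumption, since IMPORTXML itself can't be run outside Sheets). The sample markup and the href values are made up for illustration.

```python
from lxml import etree  # third-party; assumed installed (pip install lxml)

# Made-up sample markup: two rows, only the second link wraps a "special" image.
SAMPLE = """
<html><body><table>
  <tr><td>1</td><td><a href="/plain"><img class="icon"/></a></td></tr>
  <tr><td>2</td><td><a href="/special"><img class="special"/></a></td></tr>
</table></body></html>
"""

doc = etree.fromstring(SAMPLE)

# The predicate [img[@class='special']] is tested against each <a> in turn,
# so only links with a matching <img> child are kept.
links = doc.xpath("//table/tr/td[2]/a[img[@class='special']]")
hrefs = [a.get("href") for a in links]
print(hrefs)  # ['/special']
```

In Sheets, the same expression should go in as the second argument of IMPORTXML. Because the XPath uses single quotes, it fits inside the double-quoted formula string without escaping.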
If you want to select some other node in some other circumstance, you could union two paths (the | operator) and make sure that their filters (within []) are mutually exclusive, for example by having one be a path and the other be not() of that path. I'd give an example, but I'm not sure what "other random node" you'd want… Perhaps you could clarify?
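Purely as an illustration of the union idea, here is a sketch with a made-up fallback: for each row, take the link if it wraps the special image, otherwise take the row's first cell. The fallback node, the sample markup, and the names P, EXPR, and describe are all hypothetical, and lxml is assumed as the XPath engine.

```python
from lxml import etree  # third-party; assumed installed

# Made-up sample markup: row 1 has no special image, row 2 does.
SAMPLE = """
<html><body><table>
  <tr><td>1</td><td><a href="/plain"><img class="icon"/></a></td></tr>
  <tr><td>2</td><td><a href="/special"><img class="special"/></a></td></tr>
</table></body></html>
"""

# P is the shared condition; one branch filters on P, the other on not(P),
# so each row can satisfy at most one branch of the union.
P = "td[2]/a/img[@class='special']"
EXPR = f"//table/tr[{P}]/td[2]/a | //table/tr[not({P})]/td[1]"

doc = etree.fromstring(SAMPLE)

def describe(node):
    # Links are shown by their href, fallback cells by their text.
    return f"{node.tag}:{node.get('href') or node.text}"

# Sorted so the printed result does not depend on the engine's node order.
picked = sorted(describe(n) for n in doc.xpath(EXPR))
print(picked)  # ['a:/special', 'td:1']
```

Each row contributes exactly one node here: the link from the row that passes the filter, and the first cell from the row that fails it.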
The key thing is to think of XPath as a query language, not a procedural one, so you need to think in terms of selectors and filters on them, which is a rather different way of approaching problems than most programmers are used to. But the fact that a filter doesn't need to be related to its selector (a filter can start looking at the root of the document, for instance) leads to some powerful (if hard-to-read) possibilities.
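To make the point about root-anchored filters concrete, the sketch below compares a relative filter with one that starts at the document root. It assumes lxml as the XPath engine, and the sample markup is made up.

```python
from lxml import etree  # third-party; assumed installed

# Made-up sample markup: two links, only one wraps a "special" image.
SAMPLE = """
<html><body><table>
  <tr><td>1</td><td><a href="/plain"><img class="icon"/></a></td></tr>
  <tr><td>2</td><td><a href="/special"><img class="special"/></a></td></tr>
</table></body></html>
"""

doc = etree.fromstring(SAMPLE)

# Relative filter: evaluated from each <a>, so it asks
# "does THIS link contain the image?"
relative = doc.xpath("//table/tr/td[2]/a[img[@class='special']]")

# Root-anchored filter: the leading // restarts at the document root, so it
# asks "does the image exist ANYWHERE?" and gives the same answer for every <a>.
absolute = doc.xpath("//table/tr/td[2]/a[//img[@class='special']]")

print(len(relative), len(absolute))  # 1 2
```

The root-anchored form behaves like a document-wide "if exists" test: it keeps every candidate or none of them.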
Answer 2:
Use:
/self::node()[//table/tr/td[2]/a/img[@class='special']]//table/tr/td[2]/a
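Read as a single expression, this gates the whole selection on the root node: if the special image exists anywhere at that path, every //table/tr/td[2]/a is returned, otherwise nothing is. A minimal sketch of that behaviour follows, assuming lxml as the XPath engine. The two sample documents and the helper name links_if_special are hypothetical.

```python
from lxml import etree  # third-party; assumed installed

# The two-part expression above, joined into one string.
EXPR = (
    "/self::node()[//table/tr/td[2]/a/img[@class='special']]"
    "//table/tr/td[2]/a"
)

# Made-up sample documents: one contains a "special" image, one does not.
WITH_SPECIAL = """
<html><body><table>
  <tr><td>1</td><td><a href="/plain"><img class="icon"/></a></td></tr>
  <tr><td>2</td><td><a href="/special"><img class="special"/></a></td></tr>
</table></body></html>
"""
WITHOUT_SPECIAL = """
<html><body><table>
  <tr><td>1</td><td><a href="/plain"><img class="icon"/></a></td></tr>
  <tr><td>2</td><td><a href="/other"><img class="icon"/></a></td></tr>
</table></body></html>
"""

def links_if_special(markup):
    # Returns the href of every selected link, or [] when the gate fails.
    doc = etree.fromstring(markup)
    return [a.get("href") for a in doc.xpath(EXPR)]

print(links_if_special(WITH_SPECIAL))     # ['/plain', '/special']
print(links_if_special(WITHOUT_SPECIAL))  # []
```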
Source: https://stackoverflow.com/questions/13576379/xpath-simple-conditional-statement-if-node-x-exists-do-y