Question
I am using DOMDocument to parse this small piece of HTML. I am looking for a specific span tag with a specific id:
<span id="CPHCenter_lblOperandName">Hello world</span>
My code:
$dom = new domDocument;
@$dom->loadHTML($html); // the @ silences warnings about malformed HTML
$dom->preserveWhiteSpace = false;
$nodes = $dom->getElementsByTagName('//span[@id="CPHCenter_lblOperandName"');
foreach($nodes as $node){
echo $node->nodeValue;
}
But I think something is wrong with either the code or the HTML (how can I tell?):
- When I count the nodes with
echo count($nodes);
the result is always 1
- Nothing is output in the nodes loop
- How can I learn the syntax of these complex queries?
- What did I do wrong?
Answer 1:
You can simply use getElementById:
echo $dom->getElementById('CPHCenter_lblOperandName')->nodeValue;
or query with XPath:
$selector = new DOMXPath($dom);
$list = $selector->query('/html/body//span[@id="CPHCenter_lblOperandName"]');
echo($list->item(0)->nodeValue);
//or
foreach($list as $span) {
$text = $span->nodeValue;
}
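Both lookups above assume the span exists. A minimal, self-contained sketch of the getElementById route with a guard for the missing case follows; the sample $html string is a hypothetical stand-in, since the question does not show the full document.

```php
<?php
// Hypothetical sample input; the original question does not show the full $html.
$html = '<html><body><span id="CPHCenter_lblOperandName">Hello world</span></body></html>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);

// getElementById returns null when no element has that id,
// so check before reading nodeValue.
$span = $dom->getElementById('CPHCenter_lblOperandName');
$text = $span !== null ? $span->nodeValue : null;

echo $text ?? 'span not found', "\n";
```

Checking for null first avoids an error when the id is missing from the page.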
Answer 2:
Your four part question gets an answer in three parts:
- getElementsByTagName does not take an XPath expression; you need to give it a tag name.
- Nothing is output because no tag would ever match the tag name you provided (see #1).
- It looks like what you want is XPath, which means you need to create a DOMXPath object; see the PHP docs for more.
Also, a better way to control the libxml errors is libxml_use_internal_errors(true), rather than the '@' operator, which also hides other, more legitimate errors. That leaves you with code that looks something like this:
<?php
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
foreach($xpath->query("//span[@id='CPHCenter_lblOperandName']") as $node) {
echo $node->textContent;
}
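To make the difference between the two lookups concrete, here is a rough sketch that counts matches for each approach through the length property of the returned DOMNodeList. The sample $html and the variable names are assumptions for illustration. As far as I know, count() on a DOMNodeList only became meaningful in PHP 7.2, when the class started implementing Countable, which would explain the constant 1 reported in the question; length works either way. The sketch also reads the collected parser messages, which is one way to tell whether the HTML itself is the problem.

```php
<?php
// Hypothetical sample input with two spans, only one carrying the wanted id.
$html = '<html><body>'
      . '<span id="CPHCenter_lblOperandName">Hello world</span>'
      . '<span id="other">x</span>'
      . '</body></html>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);

// A plain tag name matches every <span> in the document.
$spanCount = $dom->getElementsByTagName('span')->length;

// An XPath string is treated as a literal tag name, so nothing matches.
$bogusCount = $dom->getElementsByTagName('//span[@id="CPHCenter_lblOperandName"]')->length;

// The same expression passed to DOMXPath::query selects the one span by id.
$xpath = new DOMXPath($dom);
$matches = $xpath->query("//span[@id='CPHCenter_lblOperandName']");
$matchCount = $matches->length;

// Parser messages collected while loading; inspect these to judge the HTML.
$errors = libxml_get_errors();
libxml_clear_errors();

printf(
    "span tags: %d, bogus tag name: %d, XPath matches: %d, parser messages: %d\n",
    $spanCount, $bogusCount, $matchCount, count($errors)
);
```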
Source: https://stackoverflow.com/questions/16093402/php-domdocument-how-to-get-that-content-of-this-tag