PHP DOMDocument: how to get the content of this tag?

Question

I am using DOMDocument to parse this small piece of HTML. I am looking for a specific span tag with a specific id.

<span id="CPHCenter_lblOperandName">Hello world</span>

My code:

$dom = new domDocument;
@$dom->loadHTML($html); // the @ silences warnings about malformed HTML
$dom->preserveWhiteSpace = false;
$nodes = $dom->getElementsByTagName('//span[@id="CPHCenter_lblOperandName"');

foreach($nodes as $node){
    echo $node->nodeValue;
}

For some reason, I think something is wrong with either the code or the HTML (how can I tell?):

1. When I count the nodes with echo count($nodes); the result is always 1.
2. Nothing is output in the nodes loop.
3. How can I learn the syntax of these complex queries?
4. What did I do wrong?

Answer 1:

You can simply use getElementById:

echo $dom->getElementById('CPHCenter_lblOperandName')->nodeValue;

or use an XPath query:

$selector = new DOMXPath($dom);

$list = $selector->query('/html/body//span[@id="CPHCenter_lblOperandName"]');

echo $list->item(0)->nodeValue;

// or iterate over every match
foreach ($list as $span) {
    $text = $span->nodeValue;
}
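
Both calls above assume the element exists. As a minimal sketch of a guarded lookup: getElementById returns null when no element carries the given id, so it can help to check before reading the text. The helper name textById and the sample $html string are hypothetical and only serve the illustration.

```php
<?php
// Hypothetical helper (not part of DOMDocument): returns the text of the
// element with the given id, or null when no such element exists.
function textById(DOMDocument $dom, string $id): ?string
{
    $element = $dom->getElementById($id);
    return $element === null ? null : $element->textContent;
}

// Assumed sample input, modelled on the span from the question.
$html = '<html><body><span id="CPHCenter_lblOperandName">Hello world</span></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);

var_dump(textById($dom, 'CPHCenter_lblOperandName')); // text of the span
var_dump(textById($dom, 'no_such_id'));               // null, no such element
```

The same guard applies to the XPath variant: item(0) returns null when the query matches nothing.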

Answer 2:

Your four-part question gets an answer in three parts:

1. getElementsByTagName does not take an XPath expression; you need to give it a tag name.
2. Nothing is output because no tag would ever match the tag name you provided (see #1).
3. It looks like what you want is XPath, which means you need to create a DOMXPath object; see the PHP docs for more.
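
To illustrate the first two points, here is a rough sketch that passes a bare tag name to getElementsByTagName and filters by id in PHP. The sample $html string and the $matches variable are assumptions made for the example. It reads the list size from the length property of DOMNodeList; on PHP versions before 7.2, DOMNodeList did not implement Countable, so count() on it would return 1 regardless of the contents, which would likely explain the count seen in the question.

```php
<?php
// Assumed sample input with two spans, only one of which has the wanted id.
$html = '<html><body><span id="other">Skip me</span>'
      . '<span id="CPHCenter_lblOperandName">Hello world</span></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);

// getElementsByTagName expects a bare tag name, not an XPath expression.
$spans = $dom->getElementsByTagName('span');

// DOMNodeList exposes its size through the length property.
echo $spans->length, "\n";

// Keep only the spans whose id attribute matches.
$matches = [];
foreach ($spans as $span) {
    if ($span->getAttribute('id') === 'CPHCenter_lblOperandName') {
        $matches[] = $span->textContent;
    }
}
echo implode("\n", $matches), "\n";
```

This works for a simple id lookup, but an XPath query scales better once the condition involves more than one attribute or the position in the tree.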

Also, a better method of controlling the libxml errors is to use libxml_use_internal_errors(true) (rather than the '@' operator, which will also hide other, more legitimate errors). That would leave you with code that looks something like this:

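<?php
libxml_use_internal_errors(true); // collect libxml warnings instead of emitting them
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
// query() takes an XPath expression and returns a DOMNodeList
foreach ($xpath->query("//span[@id='CPHCenter_lblOperandName']") as $node) {
    echo $node->textContent;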
}
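
As for telling whether the HTML itself is at fault, one possible approach is to read back what libxml collected once internal error handling is on. The sketch below uses libxml_get_errors() and libxml_clear_errors(); the malformed $html string is an assumed sample, and which problems get reported (if any) depends on the libxml version in use.

```php
<?php
// Assumed sample input: the closing tag does not match the opening span.
$html = '<html><body><span id="CPHCenter_lblOperandName">Hello world</div></body></html>';

// Remember the previous setting so it can be restored afterwards.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);

// Each entry is a LibXMLError object with line, column, code and message.
$errors = libxml_get_errors();
foreach ($errors as $error) {
    printf("line %d: %s\n", $error->line, trim($error->message));
}

// Empty the error buffer and restore the earlier error-handling mode.
libxml_clear_errors();
libxml_use_internal_errors($previous);
```

Clearing the buffer after each parse keeps errors from one document from being mixed up with those of the next.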

Source: https://stackoverflow.com/questions/16093402/php-domdocument-how-to-get-that-content-of-this-tag

Tags

php

html

parsing

domdocument