Question
I am currently working on a developer tracker for a game without using regex. I have hit a roadblock when trying to parse the HTML at certain parts.
What I am trying to parse:
<td class="alt1" id="td_post_139718">
<!-- message, attachments, sig -->
<!-- icon and title -->
<div class="smallfont">
<img class="inlineimg" src="images/icons/icon1.gif" alt="Default" border="0" />
<strong>Re: TERA's E3 2010 Coverage</strong>
</div>
My Code:
$titleArray = array();
foreach($idArray as $id) {
$title = $dom->getElementById('td_post_'.$id);
$smallFont = $title->getElementsByTagName("div");
echo $smallFont->nodeValue;
}
It yields:
Notice: Undefined property: DOMNodeList::$nodeValue in C:\wamp\www\crawler\crawler.php on line 71
Notice: Undefined property: DOMNodeList::$nodeValue in C:\wamp\www\crawler\crawler.php on line 71
Notice: Undefined property: DOMNodeList::$nodeValue in C:\wamp\www\crawler\crawler.php on line 71
I am trying to find the text within a <div> that is within a dynamic <td>.
I've tried all sorts of combinations to get it to work, but I haven't been able to achieve it.
Answer 1:
DOMElement::getElementsByTagName returns a node list (a DOMNodeList), not a single element. You have to iterate through it to retrieve the individual <div>s. Example:
// Loop over every <div> inside the post cell and print its text
foreach ($title->getElementsByTagName("div") as $smallFont) {
    echo htmlspecialchars($smallFont->nodeValue), "<br />";
}
You can also use the textContent property instead of nodeValue.
Answer 2:
getElementsByTagName returns a DOMNodeList, not a single node. You'll have to access the individual node from the list before trying to access nodeValue:
echo $smallFont->item(0)->nodeValue;
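Putting both answers together, here is a minimal sketch of the whole loop, with guards for posts or <div>s that are missing. The sample markup is a hypothetical reconstruction of the snippet in the question (wrapped in a table so it parses as a complete document), and the second ID in $idArray is made up to show the null check. It assumes the page was loaded with DOMDocument::loadHTML, which is what lets getElementById find the id attributes.

```php
<?php
// Hypothetical sample markup modelled on the snippet in the question.
$html = <<<'HTML'
<html><body><table><tr>
<td class="alt1" id="td_post_139718">
<div class="smallfont">
<img class="inlineimg" src="images/icons/icon1.gif" alt="Default" border="0" />
<strong>Re: TERA's E3 2010 Coverage</strong>
</div>
</td>
</tr></table></body></html>
HTML;

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

// The second ID is deliberately absent from the markup (assumption for the demo).
$idArray = array('139718', '000000');
$titleArray = array();

foreach ($idArray as $id) {
    // getElementById returns null when no element has that ID.
    $cell = $dom->getElementById('td_post_' . $id);
    if ($cell === null) {
        continue;
    }

    // getElementsByTagName returns a DOMNodeList, so pick a node from it.
    $divs = $cell->getElementsByTagName('div');
    if ($divs->length === 0) {
        continue;
    }

    // textContent of the first <div>, with surrounding whitespace trimmed.
    $titleArray[$id] = trim($divs->item(0)->textContent);
}

print_r($titleArray);
```

Checking for null and for an empty list before calling item(0) avoids a fatal error when a post ID is not on the page.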
Source: https://stackoverflow.com/questions/3163057/php-dom-find-the-text-within-a-certain-div