Question
I am using some code to pick out all the <td> tags from an HTML page:
$dom = new DOMDocument;
$dom->loadHTML($html);
foreach ($dom->getElementsByTagName('td') as $node) {
    $array_data[] = $node->nodeValue;
}
This stores the data fine in my array.
The HTML being parsed is:
<tr>
<td>DATA 1</td>
<td><a href="12345">DATA 2</a></td>
<td>DATA 3</td>
</tr>
The $array_data array contains:
Array([0]=>DATA 1 [1]=>DATA 2 [2]=>DATA 3)
My desired output also includes the href value of the <a> tag inside the <td>. Desired output:
Array([0]=>DATA 1 [1]=>12345 [2]=>DATA 2 [3]=>DATA 3)
I think <a> would be called a child node. I am very new to working with the DOM, so sorry if this seems a stupid question.
I have read the SO question "Using PHP dom to get child elements".
I've used this code to pick out the href:
foreach ($dom->getElementsByTagName('td') as $node) {
    foreach ($node->getElementsByTagName('a') as $node){
        $link = $node->getAttribute('href');
        echo '<br>';
        echo $link;
    }
    $array_data[ ] = $node->nodeValue;
}
Any help or pointers to other reading material would be greatly appreciated!
Thanks
Answer 1:
You should check whether the td has an a child. Select the anchor tags inside the td using getElementsByTagName() and check that the selection is non-empty using its length property. If the td has an anchor child, use getAttribute() to get its href attribute.
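$dom = new DOMDocument;
$dom->loadHTML($html);
$array_data = [];
foreach ($dom->getElementsByTagName('td') as $node) {
    // Look for anchor elements inside the current cell.
    $nodeAnchor = $node->getElementsByTagName("a");
    // If the cell contains an anchor, store its href first.
    if ($nodeAnchor->length) {
        $array_data[] = $nodeAnchor->item(0)->getAttribute("href");
    }
    // Then store the cell's text content.
    $array_data[] = $node->nodeValue;
}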
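For anyone who wants to run this end to end, here is a minimal self-contained sketch. It is a variation on the approach above rather than the exact code: it collects every anchor in a cell instead of only the first, and it wraps the question's sample rows in a <table> element, which is an assumption made here so the parser receives a complete table. The names $cell and $anchor are illustrative. Note that the question's attempt reuses $node for the inner loop, so after that loop $node points at the last anchor rather than the cell; distinct variable names avoid that.

```php
<?php
// Sample markup from the question, wrapped in <table> (an assumption for this sketch).
$html = '<table><tr>'
      . '<td>DATA 1</td>'
      . '<td><a href="12345">DATA 2</a></td>'
      . '<td>DATA 3</td>'
      . '</tr></table>';

$dom = new DOMDocument;
$dom->loadHTML($html);

$array_data = [];
foreach ($dom->getElementsByTagName('td') as $cell) {
    // Separate variable for the inner loop so $cell still refers to the <td>.
    foreach ($cell->getElementsByTagName('a') as $anchor) {
        // Store each link target before the cell text.
        $array_data[] = $anchor->getAttribute('href');
    }
    // nodeValue returns the text content of the cell, including nested text.
    $array_data[] = $cell->nodeValue;
}

print_r($array_data);
```

With the sample markup this should give DATA 1, 12345, DATA 2, DATA 3, in that order, which matches the desired output in the question.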
Source: https://stackoverflow.com/questions/43542965/php-dom-traverse-html-nodes-and-childnode