PHP DOM traverse HTML nodes and childnode

泪湿孤枕 submitted on 2020-01-03 01:40:11

Question


I am using some code to pick out all the <td> tags from an HTML page:

$dom = new DOMDocument;
$dom->loadHTML($html);
foreach ($dom->getElementsByTagName('td') as $node) {
    $array_data[] = $node->nodeValue;
}

This stores the data in my array without problems.

The HTML being parsed is:

<tr>
<td>DATA 1</td>
<td><a href="12345">DATA 2</a></td>
<td>DATA 3</td> 
</tr>

$array_data then contains:

Array([0]=>DATA 1 [1]=>DATA 2 [2]=>DATA 3)

I would also like to get the href value out of the <a> tag inside a <td>. Desired output:

Array([0]=>DATA 1 [1]=>12345 [2]=>DATA 2 [3]=>DATA 3)

I think the <a> would be called a child node. I am very new to working with the DOM, so sorry if this seems like a stupid question.

I have read the SO question "Using PHP dom to get child elements".

I've used this code to pick out the href:

   foreach ($dom->getElementsByTagName('td') as $node) {
      foreach ($node->getElementsByTagName('a') as $node){
      $link = $node->getAttribute('href');
      echo '<br>';
      echo $link;
      }
      $array_data[ ] = $node->nodeValue;
   }

Any help or pointers to other reading material would be greatly appreciated!
Thanks


Answer 1:


You should check whether the <td> contains an <a> element. Select the anchor with getElementsByTagName() and use the length property of the result to check that it is not empty. If the <td> does contain an anchor, use getAttribute() to read its href attribute.

$dom = new DOMDocument;
$dom->loadHTML($html);
$array_data = [];
foreach ($dom->getElementsByTagName('td') as $node) {
    // Look for <a> elements anywhere inside this <td>
    $nodeAnchor = $node->getElementsByTagName("a");
    if ($nodeAnchor->length) {
        // Store the href of the first anchor before the cell text
        $array_data[] = $nodeAnchor->item(0)->getAttribute("href");
    }
    $array_data[] = $node->nodeValue;
}

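For reference, here is a minimal self-contained sketch of the same idea, run against the sample rows from the question. It makes a few assumptions that are not in the original: the rows are wrapped in a <table> element so the parser receives complete markup, every anchor in a cell is collected rather than only the first, and the cell text is passed through trim(). The names $td and $anchor are illustrative; using a different variable for each loop keeps the inner loop from overwriting the outer loop variable.

```php
<?php
// Sample rows from the question, wrapped in <table> (an assumption for this sketch).
$html = '<table><tr>
<td>DATA 1</td>
<td><a href="12345">DATA 2</a></td>
<td>DATA 3</td>
</tr></table>';

$dom = new DOMDocument;
$dom->loadHTML($html);

$array_data = [];
foreach ($dom->getElementsByTagName('td') as $td) {
    // Collect the href of every <a> found inside this cell.
    foreach ($td->getElementsByTagName('a') as $anchor) {
        $array_data[] = $anchor->getAttribute('href');
    }
    // Then store the cell text, with surrounding whitespace removed.
    $array_data[] = trim($td->nodeValue);
}

print_r($array_data);
```

With this input the array should hold DATA 1, 12345, DATA 2 and DATA 3 in that order, which matches the desired output in the question.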



Source: https://stackoverflow.com/questions/43542965/php-dom-traverse-html-nodes-and-childnode
