getting src element using domDocument

Submitted by 强颜欢笑 on 2020-01-02 20:10:45

Question


I am using DOMDocument. I am close, but I need help with the last step.

I have the HTML below (just a snippet; the real table has a number of rows). I am trying to get the href of the link in each row.

So far I am doing the following. I can get the table, tr, and td elements, but I am not sure what to do from there.

Thanks for any help.

foreach ($dom->getElementsByTagName('table') as $tableitem) {
    if ( $tableitem->getAttribute('class') == 'tableStyle02'){
        $rows = $tableitem->getElementsByTagName('tr');
        foreach ($rows as $row){ 
            $cols = $row->getElementsByTagName('td'); 

            $hrefs = $cols->item(0)->getElementsByTagName('a'); 
        }     
    }
}

HTML snippet:

<table width="100%" border="0" cellspacing="0" cellpadding="2" class="tableStyle02"> 
    <tr> 
        <td><span class="Name"><a href="bin.php?cid=703&size=0">
               <strong>Conference Facility</strong></a></span></td>
        <td align="center" nowrap>0.00</td>
        <td align="center">&nbsp;0&nbsp;</td>
        <td align="center">&nbsp;&nbsp;</td>
        <td align="center">&nbsp;0&nbsp;</td>
        <td align="center">&nbsp;0&nbsp;</td>
        <td align="center">&nbsp;0 - 0 &nbsp;</td>
        <td align="center">&nbsp;Wired Internet,&nbsp;&nbsp;&nbsp;</td>
        <td align="center">&nbsp;&nbsp;</td>
    </tr>

Answer 1:


Let me introduce you to XPath, a query language for DOM documents:

//table[@class="tableStyle02"]//a/@href

This reads as: take the table tag whose class attribute is tableStyle02, then take the href attribute of any a tag nested anywhere inside it.

Or, since you had a foreach for the tr and td elements as well:

//table[@class="tableStyle02"]/tr/td/span/a/@href

In that path, the a tag is a direct child of the span tag, which is a direct child of the td tag, which is a direct child of the tr tag, which is a direct child of the table tag. As you can see, with XPath it is much easier to formulate the path to the element than to write everything out in PHP code.

In PHP, this can look like:

$doc = new DOMDocument();
$doc->loadHTML($html);
$xp = new DOMXPath($doc);
$href = $xp->evaluate('string(//table[@class="tableStyle02"]//a/@href)');

The variable $href then contains the string: bin.php?cid=703&size=0.


This example wraps the path in string(...), so ->evaluate returns a string built from the first attribute node found. You can return a node list instead:

$hrefs = $xp->query('//table[@class="tableStyle02"]/tr/td/span/a/@href');
#             ^^^^^                                       ^^^^

Now $hrefs contains the usual DOMNodeList; here it holds all the href attribute nodes:

echo $hrefs->item(0)->nodeValue; # bin.php?cid=703&size=0

Take care: a single slash / between two tags means the second must be a direct child of the first. With two slashes // it can be any descendant (a child, a child of a child, and so on).
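
The difference between / and // can be checked with a small script. This is only a sketch: the markup is a trimmed copy of the question's snippet, the second row ("Board Room", cid=704) is invented for illustration, and the variable names $direct and $descendant are made up here.

```php
<?php
// Trimmed copy of the question's markup. The second row is an assumption
// added for illustration; it is not part of the original HTML.
$html = <<<'HTML'
<table class="tableStyle02">
    <tr>
        <td><span class="Name"><a href="bin.php?cid=703&amp;size=0"><strong>Conference Facility</strong></a></span></td>
        <td align="center">0.00</td>
    </tr>
    <tr>
        <td><span class="Name"><a href="bin.php?cid=704&amp;size=0"><strong>Board Room</strong></a></span></td>
        <td align="center">0.00</td>
    </tr>
</table>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);
$xp = new DOMXPath($doc);

// Single slashes only: "td/a" requires <a> to be a direct child of <td>.
// Here <a> is wrapped in a <span>, so this query matches nothing.
$direct = $xp->query('//table[@class="tableStyle02"]/tr/td/a/@href');

// Double slash: <a> may sit at any depth below the table.
$descendant = $xp->query('//table[@class="tableStyle02"]//a/@href');

// Copy the attribute values into a plain array.
$hrefs = [];
foreach ($descendant as $attr) {
    $hrefs[] = $attr->nodeValue;
}

echo $direct->length, "\n"; // number of matches for the direct-child path
print_r($hrefs);            // one entry per link in the table
```

With this markup the direct-child path finds no nodes, while the descendant path finds one href per row.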




Answer 2:


You should be able to use getAttribute() on the individual DOMElement instances (just as you used it on the second line of your example):

foreach ($hrefs as $a_node) {
    if ($a_node->hasAttribute('href')) {
        print $a_node->getAttribute('href');
    }
}
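
Put together with the loop from the question, that might look roughly like the sketch below. The one-row $html string is a trimmed stand-in for the real page, and the $found array and the empty-row check are additions for illustration, not part of the original code.

```php
<?php
// One-row stand-in for the question's markup (an assumption for the demo).
$html = '<table class="tableStyle02"><tr>'
      . '<td><span class="Name"><a href="bin.php?cid=703&amp;size=0">'
      . '<strong>Conference Facility</strong></a></span></td>'
      . '<td align="center">0.00</td>'
      . '</tr></table>';

$dom = new DOMDocument();
$dom->loadHTML($html);

$found = []; // hypothetical name: collects every href that is found

foreach ($dom->getElementsByTagName('table') as $tableitem) {
    if ($tableitem->getAttribute('class') !== 'tableStyle02') {
        continue; // not the table we are after
    }
    foreach ($tableitem->getElementsByTagName('tr') as $row) {
        $cols = $row->getElementsByTagName('td');
        if ($cols->length === 0) {
            continue; // e.g. a header row that only has <th> cells
        }
        // getElementsByTagName also finds <a> tags nested inside the <span>.
        foreach ($cols->item(0)->getElementsByTagName('a') as $a_node) {
            if ($a_node->hasAttribute('href')) {
                $found[] = $a_node->getAttribute('href');
            }
        }
    }
}

print_r($found);
```

The length check on $cols guards against rows without td cells, where item(0) would return null.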



Answer 3:


You don't have to navigate your way down the DOM hierarchy to use getElementsByTagName:

foreach ($dom->getElementsByTagName('table') as $tableitem) {
    if ($tableitem->getAttribute('class') == 'tableStyle02'){
        $links = $tableitem->getElementsByTagName("a");
    }
}

$links at this point is now a DOMNodeList, so you can iterate through it:

// Initialise before the loop so $hrefs exists even if no table matches
$hrefs = array();
foreach ($dom->getElementsByTagName('table') as $tableitem) {
    if ($tableitem->getAttribute('class') == 'tableStyle02'){
        $links = $tableitem->getElementsByTagName("a");
        foreach ($links as $link) {
            $hrefs[] = $link->getAttribute("href");
        }
    }
}
// Do things with $hrefs


Source: https://stackoverflow.com/questions/11593704/getting-src-element-using-domdocument
