Question
I am new to DOMDocument. I have this HTML:
<tr class="calendar_row" data-eventid="39657">
<td class="alt1 eventDate smallfont" align="center">Sun<div class="eventday_multiple">Dec 9</div></td>
<td class="alt1 smallfont" align="center">3:34am</td>
<td class="alt1 smallfont" align="center">USD</td>
</tr>
<tr class="calendar_row" data-eventid="39658">
<td class="alt1 eventDate smallfont" align="center">Sun<div class="eventday_multiple">Dec 10</div></td>
<td class="alt1 smallfont" align="center">5:14am</td>
<td class="alt1 smallfont" align="center">EUR</td>
</tr>
First, I am trying to get the contents inside the tr elements using this code:
$ret = array();
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
//$doc->saveHTMLFile('textbox.php');
$text = $doc->getElementsByTagName('tr');
foreach ($text as $tag) {
    $ret[] = $doc->saveHTML($tag);
    echo $doc->saveHTML($tag);
}
I don't know why the value being echoed is the whole document rather than just the contents of each tr.
Second, I would also like to get the values between the td tags, such as 5:14am, EUR, etc., but I have no idea how to do that.
Pardon the newbie question.
Best regards
Answer 1:
libxml_use_internal_errors(true); // keep warnings about imperfect markup quiet
$doc = new DOMDocument();
$doc->loadHTML($html);
// td elements are children of tr, not of table, so walk rows first, then cells
foreach ($doc->getElementsByTagName('tr') as $tr) {
    foreach ($tr->getElementsByTagName('td') as $td) {
        echo $td->nodeValue, "\n";
    }
}
Answer 2:
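Passing an element to saveHTML() generates the element's outerHTML, not its innerHTML, so you get its tag, its attributes, and all of its content. Note that the optional node argument was only added in PHP 5.3.6, so you need to be running PHP >= 5.3.6; on an older version the whole document is dumped instead, which is probably what you are seeing.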
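The values between the td tags can be obtained with $td->firstChild->nodeValue; or just $td->textContent; where $td is the <td> in question. The two differ when a cell contains nested elements: firstChild gives only the first child node (for the date cell, "Sun"), while textContent concatenates all descendant text ("SunDec 9").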
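To tie both points together, here is a minimal sketch rather than a definitive solution. It assumes PHP 7 or later and wraps the rows from the question in a <table> so the fragment is well-formed. innerHtml() is a hypothetical helper name, not part of the DOM extension: it builds a node's innerHTML by concatenating the outerHTML of each child.

```php
<?php
// Sample rows from the question, wrapped in <table> (an assumption for well-formed markup).
$html = <<<HTML
<table>
<tr class="calendar_row" data-eventid="39657">
<td class="alt1 eventDate smallfont" align="center">Sun<div class="eventday_multiple">Dec 9</div></td>
<td class="alt1 smallfont" align="center">3:34am</td>
<td class="alt1 smallfont" align="center">USD</td>
</tr>
<tr class="calendar_row" data-eventid="39658">
<td class="alt1 eventDate smallfont" align="center">Sun<div class="eventday_multiple">Dec 10</div></td>
<td class="alt1 smallfont" align="center">5:14am</td>
<td class="alt1 smallfont" align="center">EUR</td>
</tr>
</table>
HTML;

// Hypothetical helper: innerHTML is the concatenated outerHTML of every child node.
function innerHtml(DOMNode $node): string
{
    $out = '';
    foreach ($node->childNodes as $child) {
        $out .= $node->ownerDocument->saveHTML($child);
    }
    return $out;
}

libxml_use_internal_errors(true); // suppress parser warnings
$doc = new DOMDocument();
$doc->loadHTML($html);

// Collect the text of every cell, grouped by the row's data-eventid attribute.
$rows = [];
foreach ($doc->getElementsByTagName('tr') as $tr) {
    $cells = [];
    foreach ($tr->getElementsByTagName('td') as $td) {
        $cells[] = trim($td->textContent);
    }
    $rows[$tr->getAttribute('data-eventid')] = $cells;
}
print_r($rows);

// innerHTML of the second cell in the document: its content without the <td> wrapper.
$secondCell = $doc->getElementsByTagName('td')->item(1);
echo innerHtml($secondCell), "\n";
```

The first cell of each row comes out as a single string such as "SunDec 9", because textContent joins the text of the cell and its nested div. If the day and date are needed separately, read $td->firstChild->nodeValue and the div's textContent individually.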
Source: https://stackoverflow.com/questions/13908212/html-dom-document-parsing