I have been trying to extract the table text along with its links from the given table (which is on site1.com) into my PHP page using a web crawler.
But unfortunately,
Chopping at HTML with string functions or regex is not a reliable method. DOMDocument and XPath do a nice job.
Code: (Demo)
$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
foreach ($xpath->evaluate("//td[@class = 'FootNotes2']/a") as $node) { // target <a> tags that have <td class="FootNotes2"> as parent
    $result[] = ['href' => $node->getAttribute('href'), 'text' => $node->nodeValue]; // extract/store the href and text values
    if (sizeof($result) == 10) { break; } // set a limit of 10 rows of data
}
if (isset($result)) {
    echo "<ul>\n";
    foreach ($result as $data) {
        echo "\t<li><a href=\"{$data['href']}\">{$data['text']}</a></li>\n";
    }
    echo "</ul>";
}
Sample Input:
$html = <<<HTML
Subject
Last Update
Replies
Views
Serious dedicated study partner for U World - step12013
02/11/17 01:50
10
318
some text - step12013
02/11/17 01:50
10
318
HTML;
Output: one list item per matched link, showing the link's text (at most 10 items).
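For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same DOMDocument/XPath approach. The function name extractLinks, the table markup, and the /topic/1 and /topic/2 hrefs are all my own placeholders, not taken from the real page, so adjust the XPath and the sample to whatever site1.com actually serves. It also silences libxml warnings, which real-world HTML tends to trigger, and escapes the values before printing them.

```php
<?php
// Hypothetical helper (not part of the original answer): collect up to $limit
// links that sit directly inside <td class="FootNotes2"> cells.
function extractLinks(string $html, int $limit = 10): array
{
    $dom = new DOMDocument;
    // Buffer parser warnings about malformed markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $result = [];
    foreach ($xpath->query("//td[@class = 'FootNotes2']/a") as $node) {
        $result[] = [
            'href' => $node->getAttribute('href'),
            'text' => trim($node->nodeValue),
        ];
        if (count($result) === $limit) {
            break; // stop once the row limit is reached
        }
    }
    return $result;
}

// Assumed sample markup: the layout and hrefs below are placeholders.
$html = <<<HTML
<table>
  <tr><th>Subject</th><th>Last Update</th><th>Replies</th><th>Views</th></tr>
  <tr>
    <td class="FootNotes2"><a href="/topic/1">Serious dedicated study partner for U World</a> - step12013</td>
    <td>02/11/17 01:50</td><td>10</td><td>318</td>
  </tr>
  <tr>
    <td class="FootNotes2"><a href="/topic/2">some text</a> - step12013</td>
    <td>02/11/17 01:50</td><td>10</td><td>318</td>
  </tr>
</table>
HTML;

$links = extractLinks($html);

echo "<ul>\n";
foreach ($links as $link) {
    // Escape both values before writing them back into HTML.
    $href = htmlspecialchars($link['href'], ENT_QUOTES);
    $text = htmlspecialchars($link['text'], ENT_QUOTES);
    echo "\t<li><a href=\"{$href}\">{$text}</a></li>\n";
}
echo "</ul>\n";
```

Returning an array from the function keeps the extraction separate from the output, so the same result could just as well be stored or JSON-encoded instead of echoed.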