Question
I am writing a small scraper script that finds the URL of an image with a particular class name. My cURL and DOMDocument code works, and as far as I can tell so does the DOMXPath query (there are no errors), but I cannot work out how to get the URL out of the XPath query results.
My code so far:
$dom = new DOMDocument();
@$dom->loadHTML($x);
$xpath = new DomXpath($dom);
$div = $xpath->query('//*[@class="productImage"]');
var_dump($div);
echo $div->item(0);
If I var_dump($x), the page is output without problems, so cURL is working fine. But I do not know how to get at the data contained in $div. I am trying to find an image with a class of 'productImage', which looks like:
<img src="/uploads/5W/yP/5WyPP4l7Z-jmZRzu_MJ6zg/1077-d.jpg" border="1" alt="Album" class="productImage">
I want the source of that image tag.
Any suggestions?
Answer 1:
$dom = new DOMDocument();
$dom->loadHTML($x);
$xpath = new DomXpath($dom);
$imgs = $xpath->query('//*[@class="productImage"]');
foreach ($imgs as $img)
{
    echo 'ImgSrc: ' . $img->getAttribute('src') . '<br />' . PHP_EOL;
}
Try that...
== EDIT: Additional Info ==
The reason I use a loop here is that the query may match more than one img. If you know there is only one element (or you only want the first DOM node found), you can access the element from the DOMNodeList via its item() method, like so:
$dom = new DOMDocument();
$dom->loadHTML($x);
$xpath = new DomXpath($dom);
$img = $xpath->query('//*[@class="productImage"]');
echo 'ImgSrc: ' . $img->item(0)->getAttribute('src') .'<br />' . PHP_EOL;
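One caveat with item(0): when the query matches nothing, item(0) returns null, and calling getAttribute() on null is a fatal error. Below is a minimal sketch of a guarded version. The helper name firstProductImageSrc and the sample markup in $x are made up for illustration; $x stands in for the cURL response from the question.

```php
<?php
// Hypothetical sample markup standing in for the cURL response in $x.
$x = '<html><body><img src="/uploads/5W/yP/5WyPP4l7Z-jmZRzu_MJ6zg/1077-d.jpg" border="1" alt="Album" class="productImage"></body></html>';

// Hypothetical helper: returns the src of the first img whose class attribute
// is exactly "productImage", or null when nothing matches.
function firstProductImageSrc(string $html): ?string
{
    // Collect libxml parse warnings internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query('//img[@class="productImage"]');
    if ($nodes === false || $nodes->length === 0) {
        return null; // invalid query or no match
    }

    $node = $nodes->item(0);
    return $node instanceof DOMElement ? $node->getAttribute('src') : null;
}

$src = firstProductImageSrc($x);
echo $src ?? 'no match', PHP_EOL;
```

Returning null rather than an empty string lets the caller tell "no such image" apart from "image with an empty src".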
Answer 2:
You don't actually need to use XPath here, because it seems that you're just after images and that can be done by using DOMDocument::getElementsByTagName(), followed by a simple filter:
foreach ($dom->getElementsByTagName('img') as $image) {
    $class = $image->getAttribute('class');
    if (strpos(" $class ", " productImage ") !== false) {
        $url = $image->getAttribute('src');
        // do stuff
    }
}
Then, you can get the src attribute by using DOMElement::getAttribute():
echo $image->getAttribute('src');
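The space-padding check above matters because an exact comparison such as @class="productImage" (as used in the first answer) does not match an element that carries several classes. If you prefer to stay with XPath, the same token test can be written with concat() and normalize-space(). This is only a sketch; the markup in $x is invented to show the three cases.

```php
<?php
// Hypothetical markup: one exact match, one multi-class match, one near miss.
$x = '<div>'
   . '<img src="/a.jpg" class="productImage">'
   . '<img src="/b.jpg" class="thumb productImage large">'
   . '<img src="/c.jpg" class="productImageSmall">'
   . '</div>';

$previous = libxml_use_internal_errors(true); // keep parse warnings out of the output
$dom = new DOMDocument();
$dom->loadHTML($x);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xpath = new DOMXPath($dom);

// Pad the normalized class list with spaces, then look for the padded token,
// so "productImage" matches as a whole word only.
$query = '//img[contains(concat(" ", normalize-space(@class), " "), " productImage ")]';

$urls = [];
foreach ($xpath->query($query) as $img) {
    $urls[] = $img->getAttribute('src');
}

print_r($urls); // contains /a.jpg and /b.jpg, but not /c.jpg
```

normalize-space() also collapses tabs and newlines between class names, which the plain strpos() check does not account for.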
Source: https://stackoverflow.com/questions/16054856/domxpath-with-domdocument-to-get-img-class-url