Scraping all images from a website using DOMDocument

大兔子大兔子 submitted on 2019-12-01 07:55:44

Question


I basically want to get all the images on any website using DOMDocument, but I can't even load the HTML, for reasons I don't understand yet.

$url="http://<any_url_here>/";
$dom = new DOMDocument();
@$dom->loadHTML($url); //i have also tried removing @
$dom->preserveWhiteSpace = false;
$dom->saveHTML();
$images = $dom->getElementsByTagName('img');
foreach ($images as $image) 
{
echo $image->getAttribute('src');
}

Nothing gets printed. Did I do something wrong in the code?


Answer 1:


You don't get a result because $dom->loadHTML() expects an HTML string, but you are passing it a URL. You first need to fetch the HTML of the page you want to parse. You can use file_get_contents() for that (fetching a URL this way requires allow_url_fopen to be enabled).

I used this in my image grab class. Works fine for me.

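// Fetch the page markup first: loadHTML() parses a string, it does not download a URL.
$html = file_get_contents('http://www.google.com/');
if ($html === false) {
    exit('Could not fetch the page.');
}
$dom = new DOMDocument();
$dom->preserveWhiteSpace = false;
libxml_use_internal_errors(true); // collect parser warnings from imperfect markup instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();
$images = $dom->getElementsByTagName('img');
foreach ($images as $image) {
  // Print the src attribute of each <img>, one per line
  echo $image->getAttribute('src'), PHP_EOL;
}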
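
The same idea can be wrapped in a small function that takes an HTML string, so the parsing step can be tried without any network access. This is only a minimal sketch: the function name extractImageSources and the sample markup are hypothetical and not part of the original answer, and it assumes that skipping <img> tags without a src attribute is the behaviour you want.

```php
<?php
// Hypothetical helper (not from the original answer):
// returns the src attribute of every <img> found in an HTML string.
function extractImageSources(string $html): array
{
    if (trim($html) === '') {
        return []; // loadHTML() does not accept an empty string, so bail out early
    }

    $dom = new DOMDocument();

    // Real-world markup is rarely valid; collect libxml warnings
    // internally instead of emitting them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $sources = [];
    foreach ($dom->getElementsByTagName('img') as $image) {
        $src = $image->getAttribute('src'); // empty string when the attribute is missing
        if ($src !== '') {
            $sources[] = $src;
        }
    }

    return $sources;
}

// Usage with an inline sample, so nothing has to be downloaded.
$sample = '<html><body><img src="/logo.png"><p>text</p>'
        . '<img src="http://example.com/a.jpg"><img alt="no src"></body></html>';
print_r(extractImageSources($sample));
```

The values come back exactly as written in the page, so relative paths such as /logo.png still have to be resolved against the page URL before the images can be downloaded.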


Source: https://stackoverflow.com/questions/15895773/scraping-all-images-from-a-website-using-domdocument
