Parse anchor tags which have img tag as child element

Submitted by 爷,独闯天下 on 2019-12-03 20:15:43

Assuming $doc is a DOMDocument representing your HTML document:

$output = array();
$xpath = new DOMXPath($doc);
# find each img inside a link
foreach ($xpath->query('//a[@href]//img') as $img) {

    # find the link by going up until an <a> is found
    # since we only found <img>s inside an <a>, this should always succeed
    for ($link = $img; $link->tagName !== 'a'; $link = $link->parentNode);

    $output[] = array(
        'href' => $link->getAttribute('href'),
        'src'  => $img->getAttribute('src'),
        'alt'  => $img->getAttribute('alt'),
    );
}
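
If you do not have $doc yet, the following is a minimal, self-contained sketch of the same idea. The sample markup and the variable name $html are hypothetical, and the ancestor:: XPath step is swapped in here as an alternative to walking up parentNode by hand; treat it as an illustration rather than the only way to do it.

```php
<?php
// Hypothetical sample markup; any HTML string would do.
$html = '<div><a href="/one"><span><img src="a.png" alt="A"></span></a>'
      . '<a href="/two">no image</a><img src="b.png" alt="B"></div>';

$doc = new DOMDocument();
// Collect libxml warnings internally instead of printing them,
// since real-world HTML is often not perfectly valid.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$output = array();
foreach ($xpath->query('//a[@href]//img') as $img) {
    // ancestor::a[@href][1] selects the nearest enclosing <a> that has an href.
    $link = $xpath->query('ancestor::a[@href][1]', $img)->item(0);
    $output[] = array(
        'href' => $link->getAttribute('href'),
        'src'  => $img->getAttribute('src'),
        'alt'  => $img->getAttribute('alt'),
    );
}
print_r($output);
```

With this sample input, only the image nested inside the first link is collected; the image that sits outside any <a> is ignored.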

Assuming your HTML is also a well-formed XML document (it has a single root node, all tags are closed, etc.), you can use SimpleXML like this:

$xml = simplexml_load_file($filename);
$items = array();
foreach ($xml->xpath('//a[@href]') as $anchor) {
    foreach ($anchor->xpath('.//img[@src][@alt]') as $img) {
        $items[] = array(
            'href' => (string) $anchor['href'],
            'src' => (string) $img['src'],
            'alt' => (string) $img['alt'],
        );
    }
}
print_r($items);

This uses XPath to search the document for all <a> tags that have an href attribute. Then it searches under each <a> tag found for any <img> tags that have both src and alt attributes. Finally, it grabs the needed attributes and adds them to the array.
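
To see how the attribute predicates filter the results, here is a small sketch that runs the same queries on an inline string instead of a file. The markup and the variable name $markup are made up for illustration; simplexml_load_string is used in place of simplexml_load_file so the example is self-contained.

```php
<?php
// Hypothetical well-formed fragment with a single root element.
$markup = '<div>'
        . '<a href="/one"><img src="a.png" alt="A"/></a>'
        . '<a href="/two"><img src="b.png"/></a>'      // img has no alt: skipped
        . '<a name="x"><img src="c.png" alt="C"/></a>' // a has no href: skipped
        . '</div>';

$xml = simplexml_load_string($markup);
$items = array();
foreach ($xml->xpath('//a[@href]') as $anchor) {
    // The leading "." keeps the search relative to the current <a>.
    foreach ($anchor->xpath('.//img[@src][@alt]') as $img) {
        $items[] = array(
            'href' => (string) $anchor['href'],
            'src'  => (string) $img['src'],
            'alt'  => (string) $img['alt'],
        );
    }
}
print_r($items);
```

Only the first link ends up in $items here: the second image lacks an alt attribute and the third anchor lacks an href.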

Use the Simple HTML DOM Parser: http://simplehtmldom.sourceforge.net/

You can do something like this (rough code; you may have to adjust it for your document):

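// include the Simple HTML DOM Parser (adjust the path to where you saved it)
include 'simple_html_dom.php';
$html = file_get_html('your html file here');

$output = array();
foreach ($html->find('a[href]') as $anchor) {
    // src and alt belong to the <img>, not the <a>, so look inside the link
    foreach ($anchor->find('img') as $img) {
        // build one entry per image; $output[]['key'] would append a new entry each time
        $output[] = array(
            'href' => $anchor->href,
            'src'  => $img->src,
            'alt'  => $img->alt,
        );
    }
}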