Matching SRC attribute of IMG tag using preg_match

走了就别回头了 2020-12-03 08:06

I'm attempting to run preg_match to extract the SRC attribute from the first IMG tag in an article (in this case, stored in $row->introtext).

preg_match('/


        
6 Answers
  • 2020-12-03 08:39

    The regex I used was much simpler. My code assumes that the string being passed to it contains exactly one img tag with no other markup:

    $pattern = '/src="([^"]*)"/';
    

    See my answer here for more info: How to extract img src, title and alt from html using php?
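
    As a rough usage sketch, the pattern above could be applied with preg_match() to a string holding a single img tag. The sample string and the variable names $tag and $src are illustrative assumptions, not part of the original answer.

    ```php
    <?php
    // Hypothetical input: a string holding exactly one img tag and nothing else.
    $tag = '<img src="images/stories/otakuzoku1.jpg" border="0" alt="store" />';

    // Same pattern as above: capture everything between the double quotes after src=.
    $pattern = '/src="([^"]*)"/';

    // preg_match() returns 1 on a match; group 1 then holds the attribute value.
    $src = preg_match($pattern, $tag, $matches) === 1 ? $matches[1] : null;

    echo $src, PHP_EOL; // images/stories/otakuzoku1.jpg
    ```

    Because the pattern only accepts double quotes, a tag written with single quotes would leave $src as null in this sketch.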

  • 2020-12-03 08:40

    If you need to use preg_match() itself, try this:

    preg_match('/(?<!_)src=([\'"])?(.*?)\\1/', $content, $matches);
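
    To read the result of that call, note that the quote character lands in group 1, so the attribute value ends up in group 2. A minimal sketch, assuming made-up sample markup in $content:

    ```php
    <?php
    // Hypothetical sample markup; data_src is there to exercise the (?<!_) lookbehind.
    $content = '<img data_src="lazy.png" src=\'real.png\' alt="x">';

    // Group 1 captures the opening quote, group 2 the attribute value.
    if (preg_match('/(?<!_)src=([\'"])?(.*?)\\1/', $content, $matches) === 1) {
        $src = $matches[2];
    } else {
        $src = null;
    }

    echo $src, PHP_EOL; // real.png
    ```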
    
  • 2020-12-03 08:41

    Try:

    include ("htmlparser.inc"); // from: http://php-html.sourceforge.net/
    
    $html = 'bla <img src="images/stories/otakuzoku1.jpg" border="0" alt="Inside Otakuzoku\'s store" /> noise <img src="das" /> foo';
    
    $parser = new HtmlParser($html);
    
    while($parser->parse()) {
        if($parser->iNodeName == 'img') {
            echo $parser->iNodeAttributes['src'];
            break;
        }
    }
    

    which will produce:

    images/stories/otakuzoku1.jpg
    

    It should work with PHP 4.x.

  • 2020-12-03 08:47

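    Here's a way to do it with built-in functions (PHP >= 4):

    // $html is assumed to hold the markup to scan; the XML parser expects it to be well-formed.
    $parser = xml_parser_create();
    xml_parse_into_struct($parser, $html, $values);
    foreach ($values as $key => $val) {
        // Case folding is on by default, so tag and attribute names come back upper-cased.
        if ($val['tag'] == 'IMG') {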
            $first_src = $val['attributes']['SRC'];
            break;
        }
    }
    
    echo $first_src;  // images/stories/otakuzoku1.jpg
    
  • 2020-12-03 08:48

    This task is better handled by a DOM parser, because a regex has no awareness of the document structure.

    Code:

    $row = (object)['introtext' => '<div>test</div><img src="source1"><p>text</p><img src="source2"><br>'];
    
    $dom = new DOMDocument();
    $dom->loadHTML($row->introtext);
    echo $dom->getElementsByTagName('img')->item(0)->getAttribute('src');
    

    Output:

    source1
    

    This says:

    1. Parse the whole html string
    2. Isolate all of the img tags
    3. Isolate the first img tag
    4. Isolate its src attribute value

    Clean, appropriate, easy to read and manage.
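
    The one-liner above assumes that at least one img tag exists. A slightly more defensive sketch of the same four steps follows; the helper name firstImgSrc() is hypothetical, and the code assumes PHP 7.1 or later with the DOM extension enabled.

    ```php
    <?php
    // Hypothetical helper (not from the original answer): returns the src of the
    // first img element, or null when the markup has no img or no src attribute.
    function firstImgSrc(string $html): ?string
    {
        // Collect libxml parse warnings instead of emitting them for sloppy markup.
        $previous = libxml_use_internal_errors(true);

        $dom = new DOMDocument();
        // Skip loadHTML() for an empty string, which it rejects.
        $loaded = $html !== '' && $dom->loadHTML($html);

        libxml_clear_errors();
        libxml_use_internal_errors($previous);

        if (!$loaded) {
            return null;
        }

        // item(0) is null when there is no img element at all.
        $img = $dom->getElementsByTagName('img')->item(0);
        if ($img === null || !$img->hasAttribute('src')) {
            return null;
        }

        return $img->getAttribute('src');
    }

    echo firstImgSrc('<div>test</div><img src="source1"><p>text</p><img src="source2"><br>'), PHP_EOL; // source1
    var_dump(firstImgSrc('<p>no images here</p>')); // NULL
    ```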

  • 2020-12-03 08:52

    Your expression is incorrect. Try:

    preg_match('/< *img[^>]*src *= *["\']?([^"\']*)/i', $row->introtext, $matches);
    

    Note the removal of brackets around img and src and some other cleanups.
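
    For reference, a short sketch of reading the result of that call. The $row object here is a made-up stand-in for the one in the question, and $firstSrc is an illustrative name.

    ```php
    <?php
    // Stand-in for the $row object from the question; the markup is made up.
    $row = (object) ['introtext' => '<p>Intro</p><IMG class="lead" SRC = \'images/a.png\' alt="A"><img src="b.png">'];

    // Group 1 holds the src value of the first img tag; the /i flag also accepts upper-case markup.
    $found = preg_match('/< *img[^>]*src *= *["\']?([^"\']*)/i', $row->introtext, $matches);
    $firstSrc = $found === 1 ? $matches[1] : null;

    echo $firstSrc, PHP_EOL; // images/a.png
    ```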
