Regex PHP, Match all links with specific text

野的像风 2021-01-14 13:55

I am looking for a regular expression in PHP that matches anchors containing a specific text. E.g. I would like to get anchors with the text mylink, like:

    <a href="blabla" ... >mylink</a>

4 answers
  • 2021-01-14 14:19

    Try a parser instead:

    require_once "simple_html_dom.php";
    
    $data = 'Hi, I am looking for a regular expression in PHP which would match the anchor with a 
    specific text on it. E.g I would like to get anchors with text mylink like: 
    <a href="blabla" ... >mylink</a>
    
    So it should match all anchors but only if they contain specific text So it should match
    these string:
    
    <a href="blabla" ... >mylink</a>
    
    <a href="blabla" ... >blabla mylink</a>
    
    <a href="blabla" ... >mylink bla bla</a>
    
    <a href="blabla" ... >bla bla mylink bla bla</a>
    
    but not this one:
    
    <a href="blabla" ... >bla bla bla bla</a> Because this one does not contain word mylink.
    
    Also this one should not match: "mylink is string" because it is not an anchor.
    
    Anybody any Idea? Thanx Granit';
    
    $html = str_get_html($data);
    
    foreach($html->find('a') as $element) {
      if(strpos($element->innertext, 'mylink') === false) {
        echo 'Ignored: ' . $element->innertext . "\n";
      } else {
        echo 'Matched: ' . $element->innertext . "\n";
      }
    }
    

    which produces the output:

    Matched: mylink
    Matched: mylink
    Matched: blabla mylink
    Matched: mylink bla bla
    Matched: bla bla mylink bla bla
    Ignored: bla bla bla bla
    

    Download simple_html_dom.php from: http://simplehtmldom.sourceforge.net/
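
    If downloading a third-party file is not an option, the same idea can probably be sketched with PHP's bundled DOM extension. This is a minimal sketch, assuming the dom extension is enabled; findAnchorsContaining() and the sample markup are made up for illustration and are not part of the answer above.

```php
<?php
// Hypothetical helper (not part of any library): returns the text of every
// <a> element whose text contains $needle.
function findAnchorsContaining(string $html, string $needle): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them,
    // since real-world HTML is rarely well-formed.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $matches = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        // textContent is the anchor's text with nested tags removed.
        if (strpos($anchor->textContent, $needle) !== false) {
            $matches[] = $anchor->textContent;
        }
    }
    return $matches;
}

// Sample input, assumed for this sketch.
$html = '<p><a href="x">mylink</a> <a href="y">bla <i>mylink</i> bla</a> '
      . '<a href="z">bla bla</a> mylink is string</p>';
print_r(findAnchorsContaining($html, 'mylink'));
```

    Because textContent drops nested markup, the second anchor should be picked up even though mylink sits inside <i>, while the bare "mylink is string" text is skipped because it is not inside an anchor.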

  • 2021-01-14 14:22

    This should work (build the regex string and insert whatever string you need in place of "mylink"):

    <\s*a\s+[^>]*>[^<>]*mylink[^<>]*<\s*\/a\s*>
    

    But this is not recommended. You should use an HTML parser instead and process the tags; regex is not really the right tool for this. (The regex above fails if a tag attribute contains ">" or if the link text contains nested tags, although the former might be rare.)

    I presume PHP doesn't require any special escaping as long as you wrap the pattern in an appropriate delimiter.

    Tested at regexpal.com

    A few notes:
    \s* - matches optional whitespace
    \s+ - matches at least one whitespace character (space, tab, ...)
    [^>] - matches any character except '>'
    [^<>] - matches any character except '<' or '>'

    UPDATE: escaped the "/" so the pattern can be used in PHP with "/" as the delimiter, i.e. '/regex/'.
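
    As a rough illustration of "build the regex string", the search text can be passed through preg_quote() so that characters such as "." or "/" in it are treated literally. anchorPattern() and the sample input are hypothetical, chosen only for this sketch.

```php
<?php
// Hypothetical helper: wraps the pattern from this answer around any search text.
function anchorPattern(string $text): string
{
    // preg_quote() escapes regex metacharacters; its second argument
    // additionally escapes the "/" delimiter.
    $quoted = preg_quote($text, '/');
    return '/<\s*a\s+[^>]*>[^<>]*' . $quoted . '[^<>]*<\s*\/a\s*>/';
}

// Sample input, assumed for this sketch.
$html = '<a href="blabla">mylink</a> <a href="blabla">bla bla bla bla</a> '
      . '<a href="blabla">bla bla mylink bla bla</a> mylink is string';

// $matches[0] collects every complete anchor that matched.
preg_match_all(anchorPattern('mylink'), $html, $matches);
print_r($matches[0]);
```

    With this input, only the first and third anchors should end up in $matches[0]; the anchor without mylink and the bare text are left out.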

  • 2021-01-14 14:22
    if (preg_match('%<\s*a\s+href="blabla"[^>]*>(.*mylink.*)<\s*/a>%', $text, $regs)) {
        $result = $regs[1];
    } else {
        $result = "";
    }
    

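    $regs[0] will hold the complete match; $regs[1] will hold the text inside the a tag. Note that .* is greedy, so when several anchors share a line the match can span more than one of them.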
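
    To see what ends up in $regs[1], here is a small comparison of the pattern above with a variant that uses [^<]* instead of .* for the link text. The variant and the sample $text are assumptions made for this sketch, not the answerer's code.

```php
<?php
// Sample input, assumed for this sketch: two anchors on one line.
$text = '<a href="blabla">foo</a> and <a href="blabla">mylink</a>';

// Pattern from the answer: the greedy .* can run across several anchors.
preg_match('%<\s*a\s+href="blabla"[^>]*>(.*mylink.*)<\s*/a>%', $text, $greedy);

// Variant (an assumption): [^<]* cannot cross a tag boundary,
// so the capture stays inside a single anchor.
preg_match('%<\s*a\s+href="blabla"[^>]*>([^<]*mylink[^<]*)<\s*/a>%', $text, $strict);

echo $greedy[1], "\n"; // foo</a> and <a href="blabla">mylink
echo $strict[1], "\n"; // mylink
```

    The trade-off is that the [^<]* variant no longer matches anchors whose text contains nested tags.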

  • 2021-01-14 14:33
    /<a[^>]*>([^<]*mylink[^<]*)<\/a>/
    

    It's a bit simplistic, as it will break if there are tags inside the link (e.g. <a href="/xyz">xyz <i>mylink</i> aaa</a>), but otherwise it should work.
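
    One possible way around the nested-tag limitation, if a parser is really not available, is to capture the whole anchor body first and check its text after strip_tags(). This is only a sketch under that assumption: anchorsWithText() is a hypothetical helper, and it still inherits the usual weaknesses of matching HTML with regex (for example ">" inside attribute values).

```php
<?php
// Hypothetical helper: returns the full markup of every anchor whose
// visible text contains $needle.
function anchorsWithText(string $html, string $needle): array
{
    // Lazy (.*?) stops at the first </a>; the "s" flag lets "." match
    // newlines and "i" also accepts <A ...>.
    preg_match_all('/<a\b[^>]*>(.*?)<\/a>/is', $html, $found, PREG_SET_ORDER);

    $result = [];
    foreach ($found as $match) {
        // strip_tags() removes nested markup such as <i>...</i>
        // before the text is checked.
        if (strpos(strip_tags($match[1]), $needle) !== false) {
            $result[] = $match[0];
        }
    }
    return $result;
}

// Sample input, assumed for this sketch.
$html = '<a href="/xyz">xyz <i>mylink</i> aaa</a> <a href="/abc">bla bla</a>';
print_r(anchorsWithText($html, 'mylink'));
```

    Here the first anchor should be returned even though mylink is wrapped in <i>, and the second one should be skipped.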
