Grabbing the href attribute of an A element

悲&欢浪女 2020-11-21 05:06

Trying to find the links on a page.

My regex is:

/<a\s[^>]*href=(\"\'??)([^\"\' >]*?)[^>]*>(.*)<\/a>/
10 Answers
  • 2020-11-21 05:50

    I agree with Gordon: you MUST use an HTML parser to parse HTML. But if you really want a regex, you can try this one:

    /^<a.*?href=(["\'])(.*?)\1.*$/
    

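    This matches <a at the beginning of the string, followed by any number of any characters (non-greedy, .*?), then href= followed by the link surrounded by either " or '. The backreference \1 requires the closing quote to be the same character as the opening one.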

    $str = '<a title="this" href="that">what?</a>';
    preg_match('/^<a.*?href=(["\'])(.*?)\1.*$/', $str, $m);
    var_dump($m);
    

    Output:

    array(3) {
      [0]=>
      string(37) "<a title="this" href="that">what?</a>"
      [1]=>
      string(1) """
      [2]=>
      string(4) "that"
    }
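
The parser route recommended at the top of this answer could look roughly like the sketch below. It assumes PHP's bundled DOM extension is available; the helper name extractHrefs is hypothetical and not from the thread.

```php
<?php
// Hypothetical helper (not from the thread): collect href values using DOMDocument.
function extractHrefs(string $html): array
{
    $dom = new DOMDocument();

    // Collect libxml warnings internally so imperfect markup does not emit notices.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $hrefs = [];
    foreach ($dom->getElementsByTagName('a') as $a) {
        // Skip anchors that have no href attribute at all.
        if ($a->hasAttribute('href')) {
            $hrefs[] = $a->getAttribute('href');
        }
    }
    return $hrefs;
}

print_r(extractHrefs('<a title="this" href="that">what?</a>'));
```

Because the parser handles attribute order and quoting itself, the position of href inside the tag no longer matters.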
    
  • 2020-11-21 05:52

    Quick test: <a\s+[^>]*href=(\"\'??)([^\1]+)(?:\1)>(.*)<\/a> seems to do the trick: the first capture group is " or ', the second is the href value (that), and the third is the link text (what?).

    I kept the first capture group (the opening " or ') so that it can be backreferenced later for the closing quote, which guarantees both quotes are the same character.

    See live example on: http://www.rubular.com/r/jsKyK2b6do
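
The quote-backreference idea can be sketched in PHP as follows. This is an adapted pattern, not the one tested on Rubular: in PCRE a \1 inside a character class is not a backreference, so the sketch uses a lazy (.*?) followed by \1 instead. The helper name firstLink is hypothetical.

```php
<?php
// Hypothetical helper: returns the href and link text of the first anchor, or null.
function firstLink(string $html): ?array
{
    // Group 1 captures the opening quote; \1 forces the same character to close the value.
    // Group 2 is the href value, group 3 the link text.
    $pattern = '/<a\s[^>]*?href=(["\'])(.*?)\1[^>]*>(.*?)<\/a>/is';

    if (preg_match($pattern, $html, $m) !== 1) {
        return null;
    }
    return ['href' => $m[2], 'text' => $m[3]];
}

var_dump(firstLink('<a title="this" href="that">what?</a>'));
```

Since the closing quote must equal the opening one, a value such as href='it"s' is captured whole instead of being cut at the inner double quote.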

  • 2020-11-21 05:53

    Why don't you just match:

    "<a.*?href\s*=\s*['"](.*?)['"]"
    
    <?php
    
    $str = '<a title="this" href="that">what?</a>';
    
    $res = array();
    
    preg_match_all("/<a.*?href\s*=\s*['\"](.*?)['\"]/", $str, $res);
    
    var_dump($res);
    
    ?>
    

    then

    $ php test.php
    array(2) {
      [0]=>
      array(1) {
        [0]=>
        string(27) "<a title="this" href="that""
      }
      [1]=>
      array(1) {
        [0]=>
        string(4) "that"
      }
    }
    

    which works. I've just removed the first capture group.

  • 2020-11-21 05:53

    I modified your regex a bit to suit your needs:

    <a.*?href=("|')(.*?)("|').*?>(.*)<\/a>

    I personally suggest you use an HTML parser.

    EDIT: Tested

  • 2020-11-21 05:55

    I'm not sure what you're trying to do here, but if you're trying to validate the link, then look at PHP's filter_var().

    If you really need to use a regular expression then check out this tool, it may help: http://regex.larsolavtorvik.com/
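
A minimal sketch of the filter_var() suggestion, assuming the goal is to check whether an extracted href is an absolute URL. The wrapper name isValidUrl is hypothetical.

```php
<?php
// Hypothetical wrapper: true when $candidate passes PHP's FILTER_VALIDATE_URL check.
function isValidUrl(string $candidate): bool
{
    // filter_var() returns the filtered value on success and false on failure.
    return filter_var($candidate, FILTER_VALIDATE_URL) !== false;
}

var_dump(isValidUrl('http://www.rubular.com/r/jsKyK2b6do')); // bool(true)
var_dump(isValidUrl('that'));                                // bool(false)
```

Note that this validates a URL; it does not find one. A relative href such as "that" from the question's sample fails the check because it has no scheme.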

  • 2020-11-21 06:00

    The pattern you want is the anchor-link pattern, something like:

    $regex_pattern = "/<a href=\"(.*)\">(.*)<\/a>/";
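
One caveat worth checking: the pattern above expects href to come directly after "<a ", which is not the case for the sample tag in the question. The sketch below compares it with an assumed, more permissive variant (not from the thread) that allows other attributes before href and uses lazy quantifiers.

```php
<?php
$html = '<a title="this" href="that">what?</a>';

// Pattern from the answer: href must be the first attribute.
$strict = "/<a href=\"(.*)\">(.*)<\/a>/";

// Assumed variant: permit attributes before href, and stop each
// capture at the first closing quote and the first closing tag.
$relaxed = '/<a\s[^>]*?href="([^"]*)"[^>]*>(.*?)<\/a>/i';

$strictHits  = preg_match($strict, $html);       // no match: title precedes href
$relaxedHits = preg_match($relaxed, $html, $m);  // $m[1] is the href, $m[2] the text

var_dump($strictHits, $relaxedHits, $m[1], $m[2]);
```

The relaxed variant only handles double-quoted values, so it is still no substitute for a real HTML parser.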
    