regex not matching due to repeated capturing group rather than capturing a repeated group

灰色年华 2021-01-14 22:31

I have the following regexp:

/(?:[\[\{]*)(?:([A-G\-][^A-G\]\}]*)+)(?:[\]\}]*)/

with the following expression:

{A''


        
3 Answers
  • 2021-01-14 23:19

    You are trying to repeat a capturing group and get every capture. That is not possible with PHP's PCRE regex: a repeated capturing group only keeps the text matched by its last iteration.
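
    To make the problem concrete, here is a minimal sketch. It assumes the question's pattern is meant with single backslashes and that the input is {A''BsCb}, the string used in the solutions below; both are assumptions about the question, not something it states in full.

    ```php
    <?php
    // Assumed form of the question's pattern (single backslashes).
    $re = '/(?:[\[\{]*)(?:([A-G\-][^A-G\]\}]*)+)(?:[\]\}]*)/';
    // Assumed input, taken from the solutions in this answer.
    $str = "{A''BsCb}";

    preg_match($re, $str, $m);

    // $m[0] is the whole match; Group 1 was overwritten on every iteration,
    // so $m[1] only holds the last item.
    var_dump($m[0]); // string(9) "{A''BsCb}"
    var_dump($m[1]); // string(2) "Cb"
    ```

    The earlier iterations (A'' and Bs) did match, but PCRE does not keep them anywhere you can read back.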

    You can either use the \G operator, which makes the regex harder to maintain but returns every item with a single pattern (Solution 1), or extract all {...} / [...] substrings, strip the brackets, and apply a simple [A-G-][^A-G]* regex to each one (Solution 2).

    Solution 1 is

    /(?:[[{]*|(?!\A)\G)\K[A-G-][^A-G\]}]*/
    

    See the regex demo. Note: this regex does not check for the closing ] or }, but it can be added with a positive lookahead.

    • (?:[[{]*|(?!\A)\G) - matches zero or more [ or { characters, or the end position of the previous successful match
    • \K - omits the text matched so far from the match result
    • [A-G-] - a letter from A to G or a -
    • [^A-G\]}]* - zero or more chars other than A to G, ] and }.

    See PHP demo.
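
    In case the demo link is unavailable, this is roughly how the Solution 1 pattern would be used with preg_match_all. The variable names are only illustrative:

    ```php
    <?php
    // Solution 1 pattern, copied as-is from above.
    $re = '/(?:[[{]*|(?!\A)\G)\K[A-G-][^A-G\]}]*/';
    $str = "{A''BsCb}";

    // Every item becomes its own full match, so the results are in $matches[0].
    preg_match_all($re, $str, $matches);
    print_r($matches[0]); // A'', Bs, Cb
    ```

    Because \K drops the leading bracket from each match, no capturing group is needed at all.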

    Solution 2 is

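    $re = '/(?|{([^}]*)}|\[([^]]*)])/'; // contents of {...} or [...] go into Group 1
    $str = "{A''BsCb}";
    $res = array();
    preg_match_all($re, $str, $m);
    foreach ($m[1] as $match) {
        // split each bracketed substring into its items
        preg_match_all('~[A-G-][^A-G]*~', $match, $tmp);
        // $tmp[0] holds the full matches; append them in order
        $res = array_merge($res, $tmp[0]);
    }
    print_r($res);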
    

    See the PHP demo

    The (?|{([^}]*)}|\[([^]]*)]) regex just matches strings like {...} or [...] (but not {...] or [...}) and captures the contents between brackets into Group 1 (since the branch reset group (?|...) resets the group IDs in each branch). Then, all we need is to grab what we need with a more coherent '~[A-G-][^A-G]*~' regex.
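
    A small sketch of what the branch reset group buys you, comparing the same alternation with and without (?|...). The input string and variable names are made up for illustration:

    ```php
    <?php
    // Hypothetical input with one curly and one square group.
    $str = "{A''Bs}[Cb]";

    // With branch reset: both branches write to Group 1.
    preg_match_all('/(?|{([^}]*)}|\[([^]]*)])/', $str, $withReset);

    // Without branch reset: the second branch writes to Group 2 instead.
    preg_match_all('/{([^}]*)}|\[([^]]*)]/', $str, $withoutReset);

    print_r($withReset[1]); // A''Bs, Cb
    print_r($withoutReset); // contents are spread over Groups 1 and 2
    ```

    With the reset, the loop only ever has to look at $m[1], whichever bracket type matched.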

  • 2021-01-14 23:23

    You already figured it out. Regarding @sln's comment: in PCRE, which is PHP's regex flavor, there is no way to collect every individual match of a repeated group in one or several capturing groups. Only the last match is captured.

    However, if you do not need to assert that the brackets are present at the start and end of the string and only need the values, there is less work to do:

    $array = array_filter(preg_split("~(?=[A-G])~", trim("{A''BsCb}", '[{}]')));
    

    Regex:

    (?=[A-G]) # Positive lookahead: the next character must be one from the character class
    

    This regex matches every such position, so splitting there yields the correct data:

    array(3) {
      [1]=>
      string(3) "A''"
      [2]=>
      string(2) "Bs"
      [3]=>
      string(2) "Cb"
    }
    

    Live demo
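
    If the keys starting at 1 are a problem, one possible variation (not part of the original one-liner) is to let preg_split drop the empty piece itself with the PREG_SPLIT_NO_EMPTY flag, which gives a zero-based list:

    ```php
    <?php
    // Strip the surrounding brackets first, as in the one-liner above.
    $str = trim("{A''BsCb}", '[{}]');

    // -1 means "no limit"; PREG_SPLIT_NO_EMPTY discards empty pieces.
    $parts = preg_split('~(?=[A-G])~', $str, -1, PREG_SPLIT_NO_EMPTY);
    print_r($parts); // A'', Bs, Cb with keys 0, 1, 2
    ```

    This also avoids a side effect of array_filter without a callback, which would drop a piece such as "0" because it is falsy.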

  • 2021-01-14 23:35

    You can get it with this pattern using preg_match_all; the results are in item 0 of the matches array:

    ~
    (?:
        \G (?!\A) # contiguous to previous match, but not at the start of the string
      |
        { (?=[^}]* }) # start with { and check that a closing } follows
      |
        \[ (?=[^]]* ]) # the same for square brackets
    )
    \K # start the match result here
    [A-G] [^]A-G}]* 
    ~xS
    

    demo
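
    A rough usage sketch of the pattern above, wrapped in a helper. The function name extractItems is hypothetical, and the S modifier is left out here since only x is needed for the free-spacing layout:

    ```php
    <?php
    // Hypothetical helper; the pattern is the one shown above, minus the S modifier.
    function extractItems(string $str): array
    {
        $re = '~
        (?:
            \G (?!\A)        # contiguous to the previous match, not at the start
          |
            { (?=[^}]* })    # { with a closing } somewhere after it
          |
            \[ (?=[^]]* ])   # [ with a closing ] somewhere after it
        )
        \K                   # start the match result here
        [A-G] [^]A-G}]*
        ~x';
        preg_match_all($re, $str, $m);
        return $m[0]; // full matches only
    }

    print_r(extractItems("{A''BsCb}")); // A'', Bs, Cb
    print_r(extractItems("{A''BsCb"));  // empty: the closing } is missing
    ```

    The second call shows the effect of the lookahead: without a closing bracket the first item never matches, so \G has nothing to continue from.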
