I have the following regexp:
/(?:[\\[\\{]*)(?:([A-G\\-][^A-G\\]\\}]*)+)(?:[\\]\\}]*)/
with the following expression:
{A\'\'
You are trying to match a repeated capturing group and get every capture. That is not possible with PHP's PCRE regex: a repeated group keeps only the text of its last iteration.
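To see the limitation in isolation, here is a minimal sketch. It uses a simplified repeated group and a shortened subject string chosen for illustration, not the full pattern from the question.

```php
<?php
// Illustrative only: a simplified repeated group, not the question's full pattern.
$subject = "A''BsCb";

// The inner capturing group is repeated three times by the outer (?:...)+.
preg_match('/(?:([A-G][^A-G]*))+/', $subject, $m);

var_dump($m[0]); // whole match: all three notes together
var_dump($m[1]); // Group 1: only the last iteration survives
```

The earlier iterations of Group 1 are overwritten, which is why the answers in this thread collect the notes as separate matches instead.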
What you can do is either extract all {...} / [...] substrings, trim the brackets from them and use a simple [A-G-][^A-G]* regex, or add a \G operator and make your regex unmaintainable but working like the original one.
Solution 1 is
/(?:[[{]*|(?!\A)\G)\K[A-G-][^A-G\]}]*/
See the regex demo. Note: this regex does not check for the closing ] or }, but that check can be added with a positive lookahead.
(?:[[{]*|(?!\A)\G) - matches a [ or {, zero or more occurrences, or the end location of the previous successful match
\K - omits the text matched so far
[A-G-] - a letter from A to G or a -
[^A-G\]}]* - zero or more chars other than A to G and other than ] and }.
See PHP demo.
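A minimal sketch of how Solution 1 might be run with preg_match_all. The second pattern, $reClosed, is an assumption: it is one possible way to add the positive lookahead mentioned in the note, not something given in the original answer.

```php
<?php
// Solution 1 pattern, copied from above.
$re = '/(?:[[{]*|(?!\A)\G)\K[A-G-][^A-G\]}]*/';

// Assumed variant: the same pattern plus a positive lookahead that requires
// a closing ] or } somewhere after the match.
$reClosed = '/(?:[[{]*|(?!\A)\G)\K[A-G-][^A-G\]}]*(?=[^\]}]*[\]}])/';

preg_match_all($re, "{A''BsCb}", $m);
print_r($m[0]);        // the three notes

preg_match_all($reClosed, "{A''BsCb}", $closed);
print_r($closed[0]);   // same three notes, closing brace is present

preg_match_all($reClosed, "{A''BsCb", $unclosed);
print_r($unclosed[0]); // empty: no closing bracket follows
```

Because of \K, the notes are in the full-match slot (index 0) and no capturing group is needed.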
Solution 2 is
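$re = '/(?|{([^}]*)}|\[([^]]*)])/';
$str = "{A''BsCb}";
$res = array();
// Step 1: capture the contents of every {...} or [...] into Group 1.
preg_match_all($re, $str, $m);
foreach ($m[1] as $match) {
    // Step 2: split the contents into notes.
    preg_match_all('~[A-G-][^A-G]*~', $match, $tmp);
    // $tmp[0] holds the full matches; append them in order.
    $res = array_merge($res, $tmp[0]);
}
print_r($res);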
See the PHP demo
The (?|{([^}]*)}|\[([^]]*)]) regex just matches strings like {...} or [...] (but not {...] or [...}) and captures the contents between the brackets into Group 1 (since the branch reset group (?|...) resets the group IDs in each branch). Then, all we need is to grab what we need with a more coherent '~[A-G-][^A-G]*~' regex.
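The effect of the branch reset group can be checked with a small comparison. This is only a sketch: the input string with one group of each bracket type is made up for illustration.

```php
<?php
$str = "{A''Bs} [Cb-D]";   // hypothetical input with one group of each kind

// With the branch reset group, both alternatives write to Group 1.
preg_match_all('/(?|{([^}]*)}|\[([^]]*)])/', $str, $reset);

// Without it, the second alternative gets its own Group 2.
preg_match_all('/{([^}]*)}|\[([^]]*)]/', $str, $plain);

print_r($reset[1]);
echo count($reset) - 1, " capture group(s) with branch reset\n";
echo count($plain) - 1, " capture group(s) without it\n";
```

With the branch reset, the loop over $m[1] in Solution 2 sees the contents of both bracket types without having to look at a second group.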
You already figured it out. Regarding @sln's comment, there is no way to gather each individual match in one or several capturing groups while repeating a group in PCRE, which is PHP's regex flavor. In this case, only the last match is captured.
However, if asserting that the brackets are present at the start and end of the string is not important and you only need the values, there is less work to do:
$array = array_filter(preg_split("~(?=[A-G])~", trim("{A''BsCb}", '[{}]')));
Regex:
(?=[A-G]) # Positive lookahead: the next character must be one from the character class
This regex matches every position that is followed by a letter from A to G, so splitting at those positions outputs the correct data:
array(3) {
  [1]=>
  string(3) "A''"
  [2]=>
  string(2) "Bs"
  [3]=>
  string(2) "Cb"
}
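The keys in that output start at 1 because the split produces an empty first piece (the lookahead matches at the very start of the trimmed string), and array_filter removes it while preserving keys. As a hedged alternative that is not part of the original answer, the PREG_SPLIT_NO_EMPTY flag of preg_split may be used so the result is indexed from 0:

```php
<?php
$notes = trim("{A''BsCb}", '[{}]');   // strip the surrounding brackets

// Original approach: array_filter() drops the empty leading piece but keeps keys.
$filtered = array_filter(preg_split('~(?=[A-G])~', $notes));

// Assumed alternative: let preg_split() skip empty pieces itself.
$split = preg_split('~(?=[A-G])~', $notes, -1, PREG_SPLIT_NO_EMPTY);

var_dump(array_keys($filtered)); // keys as in the output above
var_dump($split);                // same values, keys start at 0
```

Another difference worth keeping in mind: array_filter without a callback also drops any piece that PHP treats as falsy, such as the string "0", whereas the flag only drops empty pieces.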
Live demo
You can get them with preg_match_all and this pattern; the results are at index 0 of the matches array:
~
(?:
    \G (?!\A)        # contiguous to previous match, but not at the start of the string
  |
    { (?=[^}]* })    # start with { and check if a closing bracket follows
  |
    \[ (?=[^]]* ])   # the same for square bracket
)
\K                   # start the match result here
[A-G] [^]A-G}]*
~xS
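A sketch of how this pattern might be called. The helper name extractNotes and the second and third test strings are assumptions added for illustration; only the pattern itself comes from the answer.

```php
<?php
$pattern = '~
(?:
    \G (?!\A)        # contiguous to previous match, but not at the start of the string
  |
    { (?=[^}]* })    # start with { and check if a closing bracket follows
  |
    \[ (?=[^]]* ])   # the same for square bracket
)
\K                   # start the match result here
[A-G] [^]A-G}]*
~xS';

// Hypothetical helper name, for illustration only.
function extractNotes(string $pattern, string $subject): array
{
    preg_match_all($pattern, $subject, $m);
    return $m[0];   // item 0 holds the full matches
}

print_r(extractNotes($pattern, "{A''BsCb}"));
print_r(extractNotes($pattern, "{A''Bs}Cb"));  // Cb sits outside the braces
print_r(extractNotes($pattern, "{A''BsCb"));   // no closing brace
```

Since the first alternative requires an actual opening bracket and \G only continues from a previous match, notes outside a bracketed run are skipped, and a string with no closing bracket yields no matches at all.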
demo