I was just reading a question about how to get data inside double curly braces (this question), and then someone brought up balancing groups. I'm still not quite sure what
Just a small addition to M. Buettner's excellent answer:
What about the (?<A-B>) syntax?
(?<A-B>x) is subtly different from (?<-B>(?<A>x)). They result in the same control flow, but they capture differently.
For example, let's look at a pattern for balanced braces:
(?:[^{}]|(?<B>{)|(?<-B>}))+(?(B)(?!))
At the end of the match we do have a balanced string, but that is all we have: we don't know where the braces are, because the B stack is empty. The hard work the engine did for us is gone.
(example on Regex Storm)
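To see the problem in code, here is a minimal C# sketch. The class and method names (PlainBalancing, RemainingCaptures) are made up for illustration; only Regex.Match, Match.Groups and Group.Captures are real .NET APIs.

```csharp
using System;
using System.Text.RegularExpressions;

static class PlainBalancing
{
    // Push onto B at '{', pop from B at '}', and fail at the end
    // if anything is still left on the B stack.
    const string Pattern = @"(?:[^{}]|(?<B>{)|(?<-B>}))+(?(B)(?!))";

    // Hypothetical helper: how many captures of B survive a successful match
    // (-1 if the input does not match at all).
    public static int RemainingCaptures(string input)
    {
        Match m = Regex.Match(input, Pattern);
        return m.Success ? m.Groups["B"].Captures.Count : -1;
    }

    static void Main()
    {
        // Every '{' that was pushed has been popped again by its '}'.
        Console.WriteLine(RemainingCaptures("{1 2 {3} {4 5 {6}} 7}")); // prints 0
    }
}
```

The match succeeds, but the group that did all the bookkeeping has nothing left to report.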
(?<A-B>x) is the solution for that problem. How? It doesn't capture x into $A: it captures the content between the previous capture of B and the current position.
Let's use it in our pattern:
(?:[^{}]|(?<Open>{)|(?<Content-Open>}))+(?(Open)(?!))
This would capture into $Content the strings between the braces (and their positions), for each pair along the way.
For the string {1 2 {3} {4 5 {6}} 7} there'd be four captures: 3, 6, 4 5 {6}, and 1 2 {3} {4 5 {6}} 7. That is much better than nothing, or than }, }, }, }.
(example - click the table tab and look at ${Content}, captures)
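If you'd rather check this locally than on Regex Storm, something like the following C# sketch should do. BraceContents and its Captures method are hypothetical names chosen for this example; the group names Open and Content are the ones used in the pattern.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class BraceContents
{
    // '{' pushes onto Open; '}' pops Open and stores in Content
    // the text between that '{' and this '}'.
    const string Pattern = @"(?:[^{}]|(?<Open>{)|(?<Content-Open>}))+(?(Open)(?!))";

    // Hypothetical helper: every capture of Content, in the order the engine made them.
    public static (int Index, string Value)[] Captures(string input)
    {
        Match m = Regex.Match(input, Pattern);
        return m.Groups["Content"].Captures
                .Cast<Capture>()
                .Select(c => (c.Index, c.Value))
                .ToArray();
    }

    static void Main()
    {
        foreach (var (index, value) in Captures("{1 2 {3} {4 5 {6}} 7}"))
            Console.WriteLine($"{index}: {value}");
        // 6: 3
        // 15: 6
        // 10: 4 5 {6}
        // 1: 1 2 {3} {4 5 {6}} 7
    }
}
```

Inner pairs come first because a capture is only made when the closing brace is reached.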
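In fact, it can be used without balancing at all: (?<A>).(.(?<Content-A>).) captures the first two characters into $Content, even though they are separated by groups.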
(a lookahead is more commonly used here but it doesn't always scale: it may duplicate your logic.)
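A small C# sketch of that non-balancing use, with the pattern written out in full. Here an empty group A only marks a position, and Content-A later captures everything from that mark up to the current position. The names FirstTwo and Content are made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

static class FirstTwo
{
    // (?<A>) marks the start; (?<Content-A>) sits inside the unnamed group,
    // after the second character, and captures from the mark up to there.
    const string Pattern = @"(?<A>).(.(?<Content-A>).)";

    // Hypothetical helper: the value captured by the Content group.
    public static string Content(string input)
    {
        Match m = Regex.Match(input, Pattern);
        return m.Groups["Content"].Value;
    }

    static void Main()
    {
        Console.WriteLine(Content("abcd")); // prints ab
    }
}
```

The unnamed group itself still matches the second and third characters ("bc" for this input), so the Content capture crosses the group boundary.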
(?<A-B>x) is a strong feature: it gives you exact control over your captures. Keep that in mind when you're trying to get more out of your pattern.