I'm stuck on a RegEx problem that's seemingly very simple and yet I can't get it working.
Suppose I have input like this:
Some text %interestingbi
Why do you have the extra set of parentheses?
Try this:
%interestingbit%.+?(?<OptionalCapture>OPTIONAL_THING)?.+?%anotherinterestingbit%
Or maybe this will work:
%interestingbit%.+?(?<OptionalCapture>OPTIONAL_THING|).+?%anotherinterestingbit%
In this example, the group captures OPTIONAL_THING, or nothing.
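As a quick check of the two patterns above, here is a minimal sketch in Python. The sample string is hypothetical, since the input in the question is cut off, and Python writes named groups as (?P<name>...) rather than (?<name>...). In a backtracking engine, the lazy .+? consumes a single character before the optional group is tried, and because the group is allowed to match nothing, the engine moves on without ever reaching OPTIONAL_THING:

```python
import re

# Hypothetical sample input; the original input in the question is truncated.
sample = (
    "Some text %interestingbit% more text OPTIONAL_THING "
    "even more %anotherinterestingbit% end"
)

# Same patterns as above, with Python's (?P<name>...) named-group syntax.
optional_group = re.compile(
    r"%interestingbit%.+?(?P<OptionalCapture>OPTIONAL_THING)?.+?%anotherinterestingbit%"
)
empty_alternative = re.compile(
    r"%interestingbit%.+?(?P<OptionalCapture>OPTIONAL_THING|).+?%anotherinterestingbit%"
)

first = optional_group.search(sample)
second = empty_alternative.search(sample)

# Both patterns match the whole span, but the named group stays empty.
print(first.group("OptionalCapture"))         # None
print(repr(second.group("OptionalCapture")))  # ''
```

So with this sample, both patterns match overall, yet neither one captures OPTIONAL_THING unless it happens to start exactly one character after %interestingbit%.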
My thoughts are along similar lines to Niko's idea. However, I would suggest placing the second .+? inside the optional group instead of the first, as follows:
%interestingbit%.+?(?:(?<optionalCapture>OPTIONAL_THING).+?)?%anotherinterestingbit%
This avoids unnecessary backtracking. If the first .+? is inside the optional group and OPTIONAL_THING does not exist in the search string, the regex won't know this until it gets to the end of the string. It will then need to backtrack, perhaps quite a bit, to match %anotherinterestingbit%, which as you said will always exist.
Also, since OPTIONAL_THING, when it exists, always comes before %anotherinterestingbit%, the text after it is effectively optional as well and fits more naturally inside the optional group.
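A small Python sketch of this pattern, run against two hypothetical inputs (one with OPTIONAL_THING, one without), shows the optional capture being filled only when the text is present. The sample strings and the helper name optional_capture are assumptions made for illustration, and the named group uses Python's (?P<name>...) spelling:

```python
import re

# The pattern from this answer, with Python's (?P<name>...) named-group syntax.
pattern = re.compile(
    r"%interestingbit%.+?(?:(?P<optionalCapture>OPTIONAL_THING).+?)?%anotherinterestingbit%"
)

# Hypothetical inputs; the original sample in the question is truncated.
with_optional = (
    "Some text %interestingbit% more text OPTIONAL_THING "
    "even more %anotherinterestingbit% end"
)
without_optional = "Some text %interestingbit% more text %anotherinterestingbit% end"


def optional_capture(text):
    """Return (matched, capture): whether the pattern matched, and the optional group."""
    m = pattern.search(text)
    if m is None:
        return (False, None)
    return (True, m.group("optionalCapture"))


print(optional_capture(with_optional))     # (True, 'OPTIONAL_THING')
print(optional_capture(without_optional))  # (True, None)
```

At each position the first .+? reaches, the engine tries OPTIONAL_THING before it tries %anotherinterestingbit%, which is why the capture succeeds whenever OPTIONAL_THING appears before the closing marker.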
Try this:
%interestingbit%(?:(.+)(?<optionalCapture>OPTIONAL_THING))?(.+?)%anotherinterestingbit%
First there's a non-capturing group which matches .+OPTIONAL_THING or nothing. If a match is found, the named group inside captures OPTIONAL_THING for you. The rest is captured with .+?%anotherinterestingbit%
[edit]: I added a couple of parentheses for additional capture groups, so now the captured groups match the following:
Are these the three matches you're looking for?
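For anyone who wants to inspect the three groups themselves, the following is a rough Python sketch of the pattern above. The inputs and the helper name captured_groups are hypothetical (the original sample is not shown in full), and Python needs (?P<name>...) for the named group:

```python
import re

# The pattern from this answer, with Python's (?P<name>...) named-group syntax.
pattern = re.compile(
    r"%interestingbit%(?:(.+)(?P<optionalCapture>OPTIONAL_THING))?(.+?)%anotherinterestingbit%"
)

# Hypothetical inputs; the original sample in the question is truncated.
with_optional = (
    "Some text %interestingbit% more text OPTIONAL_THING "
    "even more %anotherinterestingbit% end"
)
without_optional = "Some text %interestingbit% more text %anotherinterestingbit% end"


def captured_groups(text):
    """Return the three capture groups as a tuple, or None if nothing matched."""
    m = pattern.search(text)
    return m.groups() if m else None


print(captured_groups(with_optional))
# (' more text ', 'OPTIONAL_THING', ' even more ')
print(captured_groups(without_optional))
# (None, None, ' more text ')
```

Note that the greedy (.+) first runs to the end of the line and then backtracks to look for OPTIONAL_THING, so when it is absent the engine scans the rest of the input before falling back to the final (.+?).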