Regex to capture an optional group in the middle of a block of input

梦如初夏 2021-01-16 08:03

I'm stuck on a RegEx problem that's seemingly very simple and yet I can't get it working.

Suppose I have input like this:

Some text %interestingbi         


        
3 answers
  • 2021-01-16 08:21

    Why do you have the extra set of parentheses?

    Try this:

    %interestingbit%.+?(?<OptionalCapture>OPTIONAL_THING)?.+?%anotherinterestingbit%
    

    Or maybe this will work:

    %interestingbit%.+?(?<OptionalCapture>OPTIONAL_THING|).+?%anotherinterestingbit%
    

    In this second pattern, the named group always takes part in the match and captures either OPTIONAL_THING or an empty string.
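    Both patterns are worth testing against input that has text between %interestingbit% and OPTIONAL_THING. The sketch below is only a quick check in Python, not a statement about every regex engine: the sample string is made up (the question's own example is cut off), and the named groups use Python's `(?P<name>...)` spelling. In this check the block is matched, but the group comes back empty, because the first lazy .+? is satisfied after a single character and the group is then allowed to match nothing.

```python
import re

# Hypothetical sample input; the question's own example is truncated.
SAMPLE = (
    "Some text %interestingbit% lots of text OPTIONAL_THING "
    "more text %anotherinterestingbit% trailing"
)

# The two patterns above, with named groups in Python's (?P<name>...) spelling.
WITH_QUESTION_MARK = re.compile(
    r"%interestingbit%.+?(?P<OptionalCapture>OPTIONAL_THING)?.+?%anotherinterestingbit%"
)
WITH_EMPTY_ALTERNATIVE = re.compile(
    r"%interestingbit%.+?(?P<OptionalCapture>OPTIONAL_THING|).+?%anotherinterestingbit%"
)

first = WITH_QUESTION_MARK.search(SAMPLE)
second = WITH_EMPTY_ALTERNATIVE.search(SAMPLE)

# Both searches succeed, but the first lazy .+? stops after one character,
# the group then matches nothing, and the second .+? consumes
# OPTIONAL_THING together with the rest of the text.
print(repr(first.group("OptionalCapture")))   # prints None
print(repr(second.group("OptionalCapture")))  # prints ''
```

    If OPTIONAL_THING happens to sit exactly one character after %interestingbit%, the group does capture it, so a test input with longer leading text is the more telling case.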

  • 2021-01-16 08:23

    My thoughts are along similar lines to Niko's idea. However, I would suggest placing the 2nd .+? inside the optional group instead of the first, as follows:

    %interestingbit%.+?(?:(?<optionalCapture>OPTIONAL_THING).+?)?%anotherinterestingbit%
    

    This avoids unnecessary backtracking. If the first .+? is inside the optional group and OPTIONAL_THING does not exist in the search string, the regex won't know this until it gets to the end of the string. It will then need to backtrack, perhaps quite a bit, to match %anotherinterestingbit%, which as you said will always exist.

    Also, since OPTIONAL_THING, when it exists, always comes before %anotherinterestingbit%, the text after it is effectively optional as well and fits more naturally into the optional group.
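    A minimal Python sketch of how this pattern behaves, assuming single-line input (the two sample strings and the helper name `optional_capture` are made up for illustration, and the named group uses Python's `(?P<name>...)` spelling):

```python
import re

# Python spelling of the pattern above: (?P<name>...) instead of (?<name>...).
PATTERN = re.compile(
    r"%interestingbit%.+?"
    r"(?:(?P<optionalCapture>OPTIONAL_THING).+?)?"
    r"%anotherinterestingbit%"
)


def optional_capture(text):
    """Return the optional capture, or None if the block or the group did not match."""
    match = PATTERN.search(text)
    return match.group("optionalCapture") if match else None


# Hypothetical inputs, one with OPTIONAL_THING and one without.
with_thing = (
    "Some text %interestingbit% lots of text OPTIONAL_THING "
    "more text %anotherinterestingbit% trailing"
)
without_thing = (
    "Some text %interestingbit% lots of text, "
    "more text %anotherinterestingbit% trailing"
)

print(optional_capture(with_thing))     # prints OPTIONAL_THING
print(optional_capture(without_thing))  # prints None
```

    The first .+? grows one character at a time, and at each position the engine first tries the optional group and then %anotherinterestingbit%, so OPTIONAL_THING is picked up whenever it appears before the closing marker. If the block can span several lines, a flag such as `re.DOTALL` would be needed so that `.` also matches newlines.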

  • 2021-01-16 08:33

    Try this:

    %interestingbit%(?:(.+)(?<optionalCapture>OPTIONAL_THING))?(.+?)%anotherinterestingbit%
    

    First there's a non-capturing group which matches .+OPTIONAL_THING or nothing. If a match is found, the named group inside it captures OPTIONAL_THING for you. The rest is matched by .+?%anotherinterestingbit%.

    [edit]: I added a couple of parentheses for additional capture groups, so now the captured groups match the following:

    • $1 : text before OPTIONAL_THING or nothing
    • $2 or $optionalCapture : OPTIONAL_THING or nothing
    • $3 : text after OPTIONAL_THING, or if OPTIONAL_THING is not found, the full text between %interestingbit% and %anotherinterestingbit%

    Are these the three matches you're looking for?
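    The group layout listed above can be checked with a short Python sketch. The two sample strings are made up for illustration, and the named group uses Python's `(?P<name>...)` spelling; the pattern is otherwise unchanged.

```python
import re

# Python spelling of the pattern above: (?P<name>...) instead of (?<name>...).
PATTERN = re.compile(
    r"%interestingbit%"
    r"(?:(.+)(?P<optionalCapture>OPTIONAL_THING))?"
    r"(.+?)%anotherinterestingbit%"
)

# Hypothetical inputs, one with OPTIONAL_THING and one without.
with_thing = (
    "Some text %interestingbit% lots of text OPTIONAL_THING "
    "more text %anotherinterestingbit% trailing"
)
without_thing = (
    "Some text %interestingbit% lots of text, "
    "more text %anotherinterestingbit% trailing"
)

# groups() lists groups 1..3 in order; a group that took no part is None.
print(PATTERN.search(with_thing).groups())
# prints (' lots of text ', 'OPTIONAL_THING', ' more text ')
print(PATTERN.search(without_thing).groups())
# prints (None, None, ' lots of text, more text ')
```

    Python numbers groups strictly left to right, so the named group is group 2 here. Not every engine does this (.NET, for example, numbers unnamed groups before named ones), so referring to the group by name is the safer choice.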
