Regex is behaving lazy, should be greedy

感动是毒 2021-02-20 15:12

I thought that by default my Regex would exhibit the greedy behavior that I want, but it does not in the following code:

 Regex keywords = new Reg         


        
3 Answers
  • 2021-02-20 15:45

    It looks like you're trying to match whole words. To do that, the entire expression has to be anchored at word boundaries, and your current one is not. Try this one instead:

    new Regex(@"\b(in|int|into|internal|interface)\b");
    

    The "\b" matches at a word boundary and is a zero-width match. Exactly which characters count as word characters depends on the engine and its options, but in general a boundary falls between a word character and whitespace, punctuation, or the start or end of the input. Because the match is zero-width, it does not include the character that caused the regex engine to detect the word boundary.
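
    As a rough illustration of the zero-width behavior, the sketch below runs the pattern above over a sample string. The class name, the `FindKeywords` helper, and the input text are hypothetical, chosen only for this example.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Text.RegularExpressions;

    public static class WordBoundaryDemo
    {
        // Same pattern as in the answer above.
        private static readonly Regex Keywords =
            new Regex(@"\b(in|int|into|internal|interface)\b");

        // Hypothetical helper: returns "index: 'value'" for every keyword found.
        public static List<string> FindKeywords(string input)
        {
            var found = new List<string>();
            foreach (Match m in Keywords.Matches(input))
            {
                // m.Value holds only the keyword; the boundary consumes no characters.
                found.Add($"{m.Index}: '{m.Value}'");
            }
            return found;
        }

        public static void Main()
        {
            // Sample input, assumed for illustration only.
            foreach (string line in FindKeywords("int x; internal interface IFoo; printing"))
            {
                Console.WriteLine(line);
            }
            // Prints:
            // 0: 'int'
            // 7: 'internal'
            // 16: 'interface'
        }
    }
    ```

    The "in" inside "printing" is not reported because there is no word boundary on either side of it, and "int" is matched whole because the shorter alternative "in" fails the trailing "\b" and the engine moves on to the next alternative.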

  • 2021-02-20 15:53

    Laziness and greediness apply to quantifiers only (?, *, +, {min,max}). Alternations are always tried in order, and the first alternative that matches wins.
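
    A minimal sketch of the difference, assuming the input string "interface" (the class name and the patterns are illustrative, not taken from the question):

    ```csharp
    using System;
    using System.Text.RegularExpressions;

    public static class AlternationOrderDemo
    {
        public static void Main()
        {
            string input = "interface";

            // Alternatives are tried left to right: "in" succeeds first,
            // so the longer alternatives are never attempted.
            Console.WriteLine(Regex.Match(input, "in|int|interface").Value);  // in

            // Listing the longest alternative first changes the result.
            Console.WriteLine(Regex.Match(input, "interface|int|in").Value);  // interface

            // Greedy versus lazy is a property of quantifiers.
            Console.WriteLine(Regex.Match(input, @"\w+").Value);   // interface (greedy)
            Console.WriteLine(Regex.Match(input, @"\w+?").Value);  // i (lazy)
        }
    }
    ```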

  • 2021-02-20 15:56

    According to RegularExpressions.info, regular expression engines are eager. Therefore, when the engine goes through your piped expression, it stops at the first alternative that matches.

    My recommendation would be to store all of your keywords in an array or list, then generate the sorted, piped expression when you need it, with longer keywords placed before any shorter keyword they start with. As long as your keyword list doesn't change, you only have to do this once. Just store the generated expression in a singleton of some sort and return that on regex executions.
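
    One way this could look is sketched below. It assumes that "sorted" means longest keyword first, and the `KeywordMatcher` class, its member names, and the keyword list are all hypothetical. `Lazy<Regex>` stands in for the singleton mentioned above.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text.RegularExpressions;

    public static class KeywordMatcher
    {
        // Hypothetical keyword list; replace with your own.
        private static readonly string[] Keywords =
            { "in", "int", "into", "internal", "interface" };

        // Built once on first use and then reused (the "singleton of some sort").
        private static readonly Lazy<Regex> Cached = new Lazy<Regex>(
            () => new Regex(BuildPattern(Keywords), RegexOptions.Compiled));

        public static Regex Instance => Cached.Value;

        // Sorts longest first so that a longer keyword is tried before its prefixes,
        // escapes each keyword, and joins the alternatives with '|'.
        public static string BuildPattern(IEnumerable<string> keywords)
        {
            var alternatives = keywords
                .OrderByDescending(k => k.Length)
                .Select(k => Regex.Escape(k));
            return string.Join("|", alternatives);
        }

        public static void Main()
        {
            Console.WriteLine(BuildPattern(Keywords));             // interface|internal|into|int|in
            Console.WriteLine(Instance.Match("interface").Value);  // interface
        }
    }
    ```

    Sorting by length is only one possible ordering. Wrapping the generated alternation in "\b(...)\b" would make the order irrelevant for whole-word matching, at the cost of requiring word boundaries around every keyword.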
