C# regex for matching specific text inside nested parentheses

Posted by 我的梦境 on 2021-02-04 20:52:14

Question


I have the following code to collect the operators that are not inside parentheses:

string filtered = Regex.Replace(input, "\\(.*?\\)", string.Empty);
var result = filtered.Split(new[] { ' ' }, 
            StringSplitOptions.RemoveEmptyEntries)
            .Where(element => element == "OR" || element == "AND");    
string temp = string.Join(" ", result);

This code does not work for nested parentheses.

For example, it works for this input:

X1 OR ( X2 AND X3 AND X4 AND X5 ) OR X6

It gives me this result: OR OR

But when my input contains nested parentheses, it gives the wrong result.

For this input:

X1 OR ( X2 AND( X3 AND X4 ) AND X5 ) OR X6

I want the result to be OR OR, but it prints OR AND OR.

Although there are two ( characters in the string, the match ends at the first ) character.

How can I adjust my regex pattern?


Answer 1:


Your \(.*?\) regex contains 3 parts: 1) \( matching a literal (, 2) .*?, a lazy dot pattern that matches zero or more characters other than a newline, as few as possible, up to the first ), and 3) \) matching a literal ). It therefore stops at the first closing parenthesis it meets, even when that parenthesis closes a nested group.
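
To make that concrete, here is a minimal sketch that runs the question's lazy pattern against the nested input. The class and method names (LazyMatchDemo, FirstLazyMatch) are made up for illustration and do not come from the original post.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LazyMatchDemo
{
    // Hypothetical helper: returns the first substring matched by the lazy pattern.
    public static string FirstLazyMatch(string input)
    {
        return Regex.Match(input, @"\(.*?\)").Value;
    }

    public static void Main()
    {
        var input = "X1 OR ( X2 AND( X3 AND X4 ) AND X5 ) OR X6";
        // The match starts at the first "(" and ends at the first ")",
        // so the tail "AND X5 )" is left behind in the string.
        Console.WriteLine(FirstLazyMatch(input)); // ( X2 AND( X3 AND X4 )
    }
}
```

Removing only that substring leaves "AND X5 )" in the input, which is where the extra AND in the question's output comes from.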

Use a balancing construct if your strings cannot contain escape sequences:

@"\((?>[^()]|(?<o>)\(|(?<-o>)\))*\)(?(o)(?!))"

The point here is that the expression should not be enclosed in any anchors (as it is in "What are regular expression Balancing Groups"), because it has to match a parenthesized substring anywhere inside a longer string.

Details:

  • \( - a literal (
  • (?> - start of an atomic group to prevent backtracking into it
    • [^()] - any char other than ( and )
    • | - or
    • (?<o>)\( - matches a literal ( and pushes an empty value into stack "o"
    • | - or
    • (?<-o>)\) - matches a literal ) and removes one value from stack "o"
  • )* - zero or more occurrences of the atomic group are matched
  • \) - a literal )
  • (?(o)(?!)) - a conditional construct failing the match if stack "o" contains values (is not empty).
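
As a rough illustration of the breakdown above, the sketch below lists every substring the balanced pattern matches in a sample string. The sample input and the names BalancedMatchDemo and TopLevelGroups are assumptions made for this example, not part of the original answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class BalancedMatchDemo
{
    // Same pattern as in the answer; the stack "o" tracks open inner parentheses.
    private const string Balanced = @"\((?>[^()]|(?<o>)\(|(?<-o>)\))*\)(?(o)(?!))";

    // Hypothetical helper: returns each outermost balanced (...) group as a string.
    public static string[] TopLevelGroups(string input)
    {
        return Regex.Matches(input, Balanced)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        foreach (var group in TopLevelGroups("A OR ( B AND ( C OR D ) ) AND ( E )"))
            Console.WriteLine(group);
        // ( B AND ( C OR D ) )
        // ( E )
    }
}
```

Each outermost group is matched as a whole, nested part included, so replacing the matches with an empty string removes everything inside parentheses in one pass.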


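// Needs: using System; using System.Linq; using System.Text.RegularExpressions;
var input = "X1 OR ( X2 AND( X3 AND X4 ) AND X5 ) OR X6";
// Remove every balanced top-level (...) group, including anything nested inside it
var filtered = Regex.Replace(input, @"\((?>[^()]|(?<o>)\(|(?<-o>)\))*\)(?(o)(?!))", string.Empty);
// Keep only the operators that remain outside the parentheses
var result = filtered.Split(new[] { ' ' },
    StringSplitOptions.RemoveEmptyEntries)
    .Where(element => element == "OR" || element == "AND");
var temp = string.Join(" ", result); // "OR OR"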




Source: https://stackoverflow.com/questions/38713119/c-sharp-regex-for-matching-sepcific-text-inside-nested-parentheses
