Question
I have the following code to extract the operators that sit outside the parentheses:
string filtered = Regex.Replace(input, "\\(.*?\\)", string.Empty);
var result = filtered.Split(new[] { ' ' },
StringSplitOptions.RemoveEmptyEntries)
.Where(element => element == "OR" || element == "AND");
string temp = string.Join(" ", result);
These lines do not work for nested parentheses.
For example, it works for this input:
X1 OR ( X2 AND X3 AND X4 AND X5 ) OR X6
It gives me this result: OR OR
But when my input contains nested parentheses, it produces the wrong result.
For this input:
X1 OR ( X2 AND( X3 AND X4 ) AND X5 ) OR X6
I want the result to be OR OR, but it prints OR AND OR.
Although there are two ( characters in the string, the match ends at the first ) character.
How can I adjust my regex pattern?
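Answer 1:
Your \(.*?\) regex contains 3 parts: 1) \( matching a literal (, 2) .*?, a lazy dot matching pattern (it matches 0+ characters other than a newline, as few as possible, up to the first ) that follows), and 3) \) matching a literal ). The pattern does not track nesting, so on nested input the match stops at the first ) and leaves the rest of the outer group in the string.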
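To make the failure concrete, here is a minimal sketch that runs the lazy pattern from the question. The class and method names (LazyPatternDemo, FirstLazyMatch, OperatorsAfterLazyStrip) are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class LazyPatternDemo
{
    // Hypothetical helper: returns the first substring the lazy pattern matches.
    public static string FirstLazyMatch(string input)
    {
        return Regex.Match(input, @"\(.*?\)").Value;
    }

    // Hypothetical helper: mirrors the code from the question.
    public static string OperatorsAfterLazyStrip(string input)
    {
        // The lazy pattern stops at the first ')' that follows each '('.
        string filtered = Regex.Replace(input, @"\(.*?\)", string.Empty);
        var result = filtered
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Where(element => element == "OR" || element == "AND");
        return string.Join(" ", result);
    }
}
```

For the nested input from the question, FirstLazyMatch returns "( X2 AND( X3 AND X4 )", so " AND X5 )" stays in the string and the final result is OR AND OR.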
Use a balancing construct if your strings cannot contain escape sequences:
@"\((?>[^()]|(?<o>)\(|(?<-o>)\))*\)(?(o)(?!))"
The point here is that the expression should not be enclosed in any anchors (as it is in "What are regular expression Balancing Groups").
Details:
\( - a literal (
(?> - start of an atomic group to prevent backtracking into it
[^()] - any char other than ( and )
| - or
(?<o>)\( - matches a literal ( and pushes an empty value into stack "o"
| - or
(?<-o>)\) - matches a literal ) and removes one value from stack "o"
)* - zero or more occurrences of the atomic group are matched
\) - a literal )
(?(o)(?!)) - a conditional construct failing the match if stack "o" contains values (is not empty).
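If the push/pop bookkeeping is hard to picture, it may help to see the same idea without a regex. The sketch below is not the balancing construct itself but a hand-written depth counter that plays a similar role to stack "o". It assumes the parentheses in the input are balanced, and the names DepthCounterSketch, StripParenthesized and OuterOperators are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text;

public static class DepthCounterSketch
{
    // Hypothetical helper: copies only the characters that are outside
    // every parenthesized group. Assumes the parentheses are balanced.
    public static string StripParenthesized(string input)
    {
        var outside = new StringBuilder();
        int depth = 0; // number of currently open '(' characters

        foreach (char c in input)
        {
            if (c == '(')
            {
                depth++; // comparable to pushing a value
            }
            else if (c == ')')
            {
                if (depth > 0) depth--; // comparable to popping; a stray ')' is dropped
            }
            else if (depth == 0)
            {
                outside.Append(c); // keep text that is not nested in any group
            }
        }
        return outside.ToString();
    }

    // Hypothetical helper: keeps only the OR/AND tokens left outside the groups.
    public static string OuterOperators(string input)
    {
        var operators = StripParenthesized(input)
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Where(token => token == "OR" || token == "AND");
        return string.Join(" ", operators);
    }
}
```

Calling OuterOperators on the nested input from the question yields OR OR. Unlike the regex, this loop silently drops unbalanced parentheses instead of refusing to match, so treat it as an illustration of the counting idea rather than a drop-in replacement.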
See the regex demo.
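using System;
using System.Linq;
using System.Text.RegularExpressions;

var input = "X1 OR ( X2 AND( X3 AND X4 ) AND X5 ) OR X6";
// Remove every balanced parenthesized group, including nested ones.
var filtered = Regex.Replace(input, @"\((?>[^()]|(?<o>)\(|(?<-o>)\))*\)(?(o)(?!))", string.Empty);
// Keep only the operators that remain outside the parentheses.
var result = filtered.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
    .Where(element => element == "OR" || element == "AND");
var temp = string.Join(" ", result); // "OR OR"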
See the C# demo
Source: https://stackoverflow.com/questions/38713119/c-sharp-regex-for-matching-sepcific-text-inside-nested-parentheses