Greedy, Non-Greedy, All-Greedy Matching in C# Regex

Submitted by 核能气质少年 on 2019-11-28 02:32:53

Question


How can I get all the matches in the following example:

// Only "abcd" is matched
MatchCollection greedyMatches = Regex.Matches("abcd", @"ab.*");

// Only "ab" is matched
MatchCollection lazyMatches   = Regex.Matches("abcd", @"ab.*?");

// How can I get all matches: "ab", "abc", "abcd"

P.S.: I want to get all the matches in a generic manner. The code above is just an example.


Answer 1:


You could use something like:

MatchCollection matches = Regex.Matches("abcd", @"(((ab)c)d)");

Then you should have three capture groups on the single match: group 1 holds abcd, group 2 holds abc and group 3 holds ab.

But, to be honest, this kind of regex doesn't make much sense, and it becomes unreadable as it gets bigger.
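To make the nested-group idea concrete, here is a minimal sketch that reads the groups back from the first match. The class and method names (NestedGroupsDemo, CaptureGroups) are made up for illustration and are not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NestedGroupsDemo
{
    // Hypothetical helper: returns the values of groups 1..n of the first match,
    // or an empty list when the pattern does not match.
    public static List<string> CaptureGroups(string input, string pattern)
    {
        var result = new List<string>();
        Match m = Regex.Match(input, pattern);
        if (!m.Success)
            return result;

        // Groups[0] is the whole match; numbered groups start at 1 and are
        // ordered by the position of their opening parenthesis.
        for (int i = 1; i < m.Groups.Count; i++)
            result.Add(m.Groups[i].Value);
        return result;
    }

    public static void Main()
    {
        // Prints abcd, abc, ab (one per line).
        foreach (string value in CaptureGroups("abcd", @"(((ab)c)d)"))
            Console.WriteLine(value);
    }
}
```

Note that the groups come back outermost first, so the longest value is listed before the shortest.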

Edit:

MatchCollection nonGreedyMatches = Regex.Matches("abcd", @"ab.?");

By the way, there is an error in that pattern: it can only match ab or abc (read: ab plus one optional character).

Lazy version of:

MatchCollection greedyMatches    = Regex.Matches("abcd", @"ab.*");

is:

MatchCollection nonGreedyMatches    = Regex.Matches("abcd", @"ab.*?");
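A small sketch comparing the quantifiers discussed in this answer on the same input may help. FirstMatch is a hypothetical helper name, and the last pattern (ab.??, the lazy form of ab.?) is added here only for comparison.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyLazyDemo
{
    // Hypothetical helper: text of the first match, or an empty string if there is none.
    public static string FirstMatch(string input, string pattern)
    {
        return Regex.Match(input, pattern).Value;
    }

    public static void Main()
    {
        Console.WriteLine(FirstMatch("abcd", @"ab.*"));   // greedy star: abcd
        Console.WriteLine(FirstMatch("abcd", @"ab.*?"));  // lazy star: ab
        Console.WriteLine(FirstMatch("abcd", @"ab.?"));   // greedy optional: abc
        Console.WriteLine(FirstMatch("abcd", @"ab.??"));  // lazy optional: ab
    }
}
```

In each case the engine reports a single match per starting position, which is why neither the greedy nor the lazy form alone returns ab, abc and abcd together.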



Answer 2:


If a solution exists, it probably involves a capturing group and the RightToLeft option:

string s = @"abcd";
Regex r = new Regex(@"(?<=^(ab.*)).*?", RegexOptions.RightToLeft);
foreach (Match m in r.Matches(s))
{
  Console.WriteLine(m.Groups[1].Value);
}

output:

abcd
abc
ab

I say "if" because, while it works for your simple test case, I can't guarantee this trick will help with your real-world problem. RightToLeft mode is one of .NET's more innovative features; offhand, I can't think of another flavor that has anything equivalent to it. The official documentation on it is sparse (to put it mildly), and so far there don't seem to be a lot of developers using it and sharing their experiences online. So try it and see what happens.




Answer 3:


You can't get three different results from only one match.

If you want to match only "ab" you can use ab.?? or a.{1} (or a lot of other options)
If you want to match only "abc" you can use ab. or a.{2} (or a lot of other options)
If you want to match only "abcd" you can use ab.* or a.{3} (or a lot of other options)
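Since one match yields one result, a generic way to collect every possibility is to run many matches. The following is a brute-force sketch, not a method proposed in any of the answers: it tests every non-empty substring against the pattern anchored with \A and \z. The names AllMatchesDemo and AllMatches are hypothetical. It assumes the pattern has no lookarounds or anchors that depend on text outside the candidate substring, and it runs a match for each of the roughly n²/2 substrings, so it only suits short inputs.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AllMatchesDemo
{
    // Hypothetical helper: every non-empty substring of input that the
    // pattern matches in its entirety, ordered by start index, then length.
    public static List<string> AllMatches(string input, string pattern)
    {
        // \A and \z pin the pattern to the whole candidate substring; the
        // non-capturing group keeps alternations inside the pattern intact.
        var anchored = new Regex(@"\A(?:" + pattern + @")\z");
        var results = new List<string>();

        for (int start = 0; start < input.Length; start++)
        {
            for (int length = 1; length <= input.Length - start; length++)
            {
                string candidate = input.Substring(start, length);
                if (anchored.IsMatch(candidate))
                    results.Add(candidate);
            }
        }
        return results;
    }

    public static void Main()
    {
        // Prints ab, abc, abcd (one per line).
        foreach (string value in AllMatches("abcd", @"ab.*"))
            Console.WriteLine(value);
    }
}
```

If the same substring text occurs at several positions, this sketch lists it once per position; returning index and length pairs instead of strings would make that explicit.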



Source: https://stackoverflow.com/questions/3898210/greedy-non-greedy-all-greedy-matching-in-c-sharp-regex
