Regex: Matching against groups in different order without repeating the group

遇见更好的自我 2021-01-31 11:21

Let's say I have two strings like this:

XABY
XBAY

A simple regex that matches both would go like this:

X(AB|BA)Y
4 answers
  •  梦如初夏
    2021-01-31 11:32

    X(?:A()|B()){2}\1\2Y
    

    Basically, you use an empty capturing group to check off each item when it's matched, then the back-references ensure that everything's been checked off.

    Be aware that this relies on undocumented regex behavior, so there's no guarantee that it will work in your regex flavor--and if it does, there's no guarantee that it will continue to work as that flavor evolves. But as far as I know, it works in every flavor that supports back-references. (EDIT: It does not work in JavaScript.)
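    The check-box pattern above can be tried out with a short script. This is a minimal sketch, assuming Python's `re` module as the flavor under test; the names `CHECKBOX` and `matches` are made up for illustration and are not part of the original answer.

```python
import re

# Each empty group () acts as a "check box" that is set when its item matches.
# The back-references \1 and \2 then succeed only if both boxes were checked.
CHECKBOX = re.compile(r"X(?:A()|B()){2}\1\2Y")


def matches(text: str) -> bool:
    """Return True if text is X, then A and B in either order, then Y."""
    return CHECKBOX.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ("XABY", "XBAY", "XAAY", "XBBY"):
        print(sample, matches(sample))
```

    In this flavor a back-reference to a group that never took part in the match fails, which is why `XAAY` and `XBBY` should be rejected: one of the two boxes is never checked.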

    EDIT: You say you're using named groups to capture parts of the match, which adds a lot of visual clutter to the regex, if not real complexity. Well, if you happen to be using .NET regexes, you can still use simple numbered groups for the "check boxes". Here's a simplistic example that finds and picks apart a bunch of month-day strings without knowing their internal order:

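      using System;
      using System.Text.RegularExpressions;

      // Each alternative ends in an empty group that acts as a check box:
      // \1 is set once a month has matched, \2 once a day has matched.
      Regex r = new Regex(
        @"(?:
            (?<MONTH>Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)()
            |
            (?<DAY>\d+)()
          ){2}
          \1\2",
        RegexOptions.IgnorePatternWhitespace);
    
      string input = @"30Jan Feb12 Mar23 4Apr May09 11Jun";
      foreach (Match m in r.Matches(input))
      {
        // The named groups are read by name, whichever order they matched in.
        Console.WriteLine("{0} {1}", m.Groups["MONTH"], m.Groups["DAY"]);
      }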
    

    This works because in .NET, the presence of named groups has no effect on the ordering of the non-named groups. Named groups have numbers assigned to them, but those numbers start after the last of the non-named groups. (I know that seems gratuitously complicated, but there are good reasons for doing it that way.)

    Normally you want to avoid using named and non-named capturing groups together, especially if you're using back-references, but I think this case could be a legitimate exception.
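    For contrast, here is a sketch of the same month-day idea in a flavor that numbers named and unnamed groups together, from left to right. It assumes Python's `re` module, where the named groups take numbers 1 and 3, so the empty check-box groups become 2 and 4. The names `MONTH_DAY` and `parse_month_days` are hypothetical, and the month list includes Aug.

```python
import re

# In Python every capturing group, named or not, is numbered by its opening
# parenthesis: MONTH is 1, its check box is 2, DAY is 3, its check box is 4.
MONTH_DAY = re.compile(
    r"""
    (?:
        (?P<MONTH>Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)()
      |
        (?P<DAY>\d+)()
    ){2}
    \2\4    # both check boxes must have been set
    """,
    re.VERBOSE,
)


def parse_month_days(text: str) -> list[tuple[str, str]]:
    """Return (month, day) pairs found in text, whatever their internal order."""
    return [(m.group("MONTH"), m.group("DAY")) for m in MONTH_DAY.finditer(text)]


if __name__ == "__main__":
    for month, day in parse_month_days("30Jan Feb12 Mar23 4Apr May09 11Jun"):
        print(month, day)
```

    The back-references have to be recounted whenever a named group is added or removed, which is the clutter that the .NET numbering scheme avoids.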
