Simplifying the regex “ab|a|b”

时光说笑 2021-01-21 09:57

(How) could the following regex be simplified:

ab|a|b

?

I'm looking for a less redundant one.

1 Answer
  •  -上瘾入骨i
    2021-01-21 10:22

    If you are using Perl or a PCRE-based engine (such as PHP's preg_ functions), you can reuse the sub-pattern of an earlier group elsewhere in the pattern, like this:

    /(a)(b)|(?1)|(?2)/
    

    The main purpose of this feature is to support recursion, but it can be used for pattern reuse as well.

    Note that in this case you cannot avoid capturing a and b in the first alternative, which incurs some (possibly unnecessary) overhead. To avoid this, you can define the groups inside a conditional that is never executed. The canonical way to do this is a (?(DEFINE)...) group (formally a conditional that checks whether a group named DEFINE matched anything; since no such group exists, its contents are never executed and serve only as definitions):

    /(?(DEFINE)(a)(b))(?1)(?2)|(?1)|(?2)/
    

    If your engine doesn't support that (EDIT: since you are using Java, no, this feature is not supported), the best you can get in a single pattern is indeed:

    ab?|b
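
    As a quick sanity check, the following Java sketch compares ab?|b with the original ab|a|b on a handful of inputs using full-string matching. The class name, method name and sample list are made up for illustration, and the samples are not an exhaustive proof of equivalence.

    ```java
    import java.util.regex.Pattern;

    // Hypothetical helper class, not part of the original answer.
    class SimplifiedAlternation {
        static final Pattern ORIGINAL = Pattern.compile("ab|a|b");
        static final Pattern SIMPLIFIED = Pattern.compile("ab?|b");

        // True if both patterns agree on whether the whole input matches.
        static boolean agree(String input) {
            return ORIGINAL.matcher(input).matches()
                    == SIMPLIFIED.matcher(input).matches();
        }

        public static void main(String[] args) {
            // Illustrative samples only; extend the list as needed.
            String[] samples = {"", "a", "b", "ab", "ba", "aa", "bb", "abb"};
            for (String s : samples) {
                System.out.println("\"" + s + "\": matches="
                        + SIMPLIFIED.matcher(s).matches()
                        + ", agrees with original=" + agree(s));
            }
        }
    }
    ```

    Both patterns also try the longer match "ab" first when searching inside a larger string, because ab is the first alternative in the original and b? is greedy in the simplified version.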
    

    Alternatively, you can build the ab|a|b version manually by string concatenation/formatting like:

    String a = "a";
    String b = "b";
    String pattern = a + b + "|" + a + "|" + b;
    

    This avoids the duplication as well. Or you can use 3 separate patterns ab, a and b against the subject string (where the first one is again a concatenation of the latter two).
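
    Both options can be sketched in runnable form as below. One assumption goes beyond the original snippet: each fragment is wrapped in a non-capturing group (?:...) so that concatenation still behaves as intended if a or b is later replaced by a longer sub-pattern that contains its own alternation. The class and method names are hypothetical.

    ```java
    import java.util.regex.Pattern;

    // Hypothetical helper class, not part of the original answer.
    class ComposedPattern {
        // Wrap a fragment so it stays a single unit when concatenated.
        static String group(String fragment) {
            return "(?:" + fragment + ")";
        }

        // Single pattern equivalent to "ab|a|b", with each fragment written once.
        static Pattern combine(String a, String b) {
            String ga = group(a);
            String gb = group(b);
            return Pattern.compile(ga + gb + "|" + ga + "|" + gb);
        }

        // Alternative: three separate patterns tried in turn on the whole subject.
        static boolean matchesAny(String a, String b, String subject) {
            Pattern[] patterns = {
                Pattern.compile(group(a) + group(b)), // concatenation of the two
                Pattern.compile(a),
                Pattern.compile(b)
            };
            for (Pattern p : patterns) {
                if (p.matcher(subject).matches()) {
                    return true;
                }
            }
            return false;
        }

        public static void main(String[] args) {
            Pattern p = combine("a", "b");
            System.out.println(p.pattern());
            System.out.println(p.matcher("ab").matches());
            System.out.println(matchesAny("a", "b", "ba"));
        }
    }
    ```

    The duplication still exists in the compiled pattern, but the source of each fragment lives in exactly one place, which is usually what matters for maintenance.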
