Regex 'or' operator avoid repetition

轮回少年 2021-02-20 03:16

How can I use the or operator while not allowing repetition? In other words the regex:

(word1|word2|word3)+

will match wo

4 Answers
  • 2021-02-20 03:39

    You could use a negative look-ahead containing a back reference:

    ^(?:(word1|word2|word3)(?!.*\1))+$
    

    where \1 refers to the match of the capture group (word1|word2|word3).

    Note that this assumes word2 cannot be formed by appending characters to word1, and that word3 cannot be formed by appending characters to word1 or word2.
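
A minimal sketch of the back-reference pattern above, assuming Python's `re` module. The function name `has_no_repeats`, the sample strings, and the `foo`/`foobar` word list are made up for illustration; the second pattern only shows the caveat about one word extending another.

```python
import re

# Pattern from the answer: each captured word must not occur again later.
NO_REPEAT = re.compile(r"^(?:(word1|word2|word3)(?!.*\1))+$")


def has_no_repeats(text: str) -> bool:
    """Return True if text is a run of the three words with none repeated."""
    return NO_REPEAT.match(text) is not None


# Hypothetical word list for the caveat: "foobar" is "foo" plus extra characters.
PREFIX_CASE = re.compile(r"^(?:(foo|foobar)(?!.*\1))+$")

if __name__ == "__main__":
    for sample in ["word1word2", "word3word1word2", "word1word1", "word1word2word1"]:
        print(sample, has_no_repeats(sample))
    # "foofoobar" uses each word once, but the look-ahead sees "foo" inside "foobar".
    print("foofoobar", PREFIX_CASE.match("foofoobar") is not None)
```

The last check is the situation the note warns about: the string is rejected even though no word is actually repeated.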

  • 2021-02-20 03:42

    Byers' solution is too hard-coded and becomes cumbersome as the number of words grows. Why not simply have the regex look for a duplicate match?

    ([^\d]+\d)+(?=.*\1)
    

    If it matches, a repetition has been found in the input. If it does not match, you have a valid set of data.
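
A small sketch of this inverted check, assuming Python's `re` module and tokens shaped like `word1` (one or more non-digits followed by a single digit). The name `has_repetition` is hypothetical.

```python
import re

# Pattern from the answer: a token is non-digits followed by one digit.
# A successful search means some token occurs again later in the string.
DUPLICATE = re.compile(r"([^\d]+\d)+(?=.*\1)")


def has_repetition(text: str) -> bool:
    """Return True if the duplicate-detecting pattern finds a repeated token."""
    return DUPLICATE.search(text) is not None


if __name__ == "__main__":
    for sample in ["word1word2word3", "word1word2word1", "word1word1"]:
        print(sample, "repeated" if has_repetition(sample) else "valid")
```

Because this is an unanchored search, it can also fire when one token is merely the tail of another: `sword1word1` is reported as a repetition, since the search can start at the `w` of `sword1`.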

  • 2021-02-20 03:47

    You could use negative lookaheads:

    ^(?:word1(?!.*word1)|word2(?!.*word2)|word3(?!.*word3))+$
    

    See it working online: rubular
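
Writing one look-ahead per word by hand gets long quickly, so the pattern can be generated instead. This is only a sketch, assuming Python's `re` module; `build_no_repeat_pattern` is a hypothetical helper, not part of any library.

```python
import re


def build_no_repeat_pattern(words):
    """Build ^(?:w1(?!.*w1)|w2(?!.*w2)|...)+$ for an arbitrary word list."""
    alternatives = []
    for word in words:
        escaped = re.escape(word)  # treat each word as literal text
        alternatives.append(f"{escaped}(?!.*{escaped})")
    return re.compile("^(?:" + "|".join(alternatives) + ")+$")


pattern = build_no_repeat_pattern(["word1", "word2", "word3"])

if __name__ == "__main__":
    print(pattern.pattern)
    for sample in ["word2word1", "word2word2"]:
        print(sample, pattern.match(sample) is not None)
```

For the three words in the question, the generated pattern is the same as the hand-written one above.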

  • 2021-02-20 03:48

    The lookahead solutions will not work in several cases. You can solve this properly, without looking ahead through the rest of the string, by using a construct like this:

    (?:(?(1)(?!))(word1)|(?(2)(?!))(word2)|(?(3)(?!))(word3))+
    

    This works even if some words are substrings of others, and it also works if you just want to find the matching substrings of a larger string (rather than only matching the whole string).

    Live demo.

    It works by failing an alternative if it has already been matched, which is what (?(1)(?!)) does. (?(1)foo) is a conditional that matches foo only if group 1 has previously matched. (?!) is an empty negative lookahead, so it always fails.
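
A rough sketch of the conditional construct, assuming an engine with conditional groups; Python's `re` module accepts the `(?(1)...)` syntax, while some engines have no conditionals at all. The words `foo` and `foobar` and the name `each_word_once` are hypothetical, chosen so that one word is a prefix of the other.

```python
import re

# (?(1)(?!)) fails the first alternative once group 1 has captured;
# (?(2)(?!)) does the same for the second alternative and group 2.
CONDITIONAL = re.compile(r"(?:(?(1)(?!))(foo)|(?(2)(?!))(foobar))+")

# Lookahead version of the same word list, for comparison.
LOOKAHEAD = re.compile(r"(?:foo(?!.*foo)|foobar(?!.*foobar))+")


def each_word_once(text: str) -> bool:
    """Return True if text consists of the words with none used twice."""
    return CONDITIONAL.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["foofoobar", "foofoo"]:
        print(sample, each_word_once(sample))
    # The lookahead version rejects "foofoobar" because "foo" occurs inside "foobar".
    print("lookahead:", LOOKAHEAD.fullmatch("foofoobar") is not None)
```

Here `fullmatch` anchors the check to the whole string; using `search` instead would find the longest qualifying run inside a larger string, as the answer mentions.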
