regex to match repeated consonant

悲&欢浪女 2021-02-04 03:28

How can I detect with a regex whether the same consonant is repeated three or more times in a row?

My idea is to match words like tttoo

6 answers
  • 2021-02-04 03:41

    Try this:

    ([b-df-hj-np-tv-z])\1{2,}
    

    Explanation:

    • [b-df-hj-np-tv-z] are all the consonants
    • \1 is the back reference to the 1st group (ie the same character)
    • {2,} means "2 or more of the preceding term", making 3 or more in all

    See live demo.
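
A minimal sketch of the pattern above in Python, assuming the standard `re` module; the helper name `find_repeats` is made up for illustration. It only covers lowercase ASCII consonants, as the character class is written.

```python
import re

# Consonants written as explicit ranges that skip a, e, i, o, u.
TRIPLE_CONSONANT = re.compile(r"([b-df-hj-np-tv-z])\1{2,}")


def find_repeats(text):
    """Return every run of three or more identical lowercase consonants."""
    return [m.group(0) for m in TRIPLE_CONSONANT.finditer(text)]


print(find_repeats("tttoo"))       # ['ttt']
print(find_repeats("too little"))  # []
```

Passing `re.IGNORECASE` to `re.compile` would be one way to cover uppercase input as well.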

  • 2021-02-04 03:44

    There may be shortcuts in certain regex libraries but you can always...

    b{3,}|c{3,}|d{3,}...
    

    Some libs for example let you match using a back reference which may be a tad cleaner...

    ([bcd...])\1{2,}
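
For engines without back-references, the long alternation does not have to be typed by hand. A rough sketch in Python that builds it from the alphabet; the names `CONSONANTS`, `ALTERNATION` and `NO_BACKREF` are illustrative, and "y" is treated as a consonant here.

```python
import re
import string

# Every lowercase ASCII letter that is not a vowel.
CONSONANTS = [c for c in string.ascii_lowercase if c not in "aeiou"]

# Builds "b{3,}|c{3,}|d{3,}|..." with one branch per consonant.
ALTERNATION = "|".join(f"{c}{{3,}}" for c in CONSONANTS)

NO_BACKREF = re.compile(ALTERNATION)

match = NO_BACKREF.search("tttoo")
print(match.group(0) if match else None)  # ttt
```

The generated pattern is longer than the back-reference version but relies on nothing beyond alternation and counted repetition.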
    
  • 2021-02-04 03:46

    This is about the shortest regex I could think of to do it:

    (?i)([b-z&&[^eiou]])\1\1+
    

    This uses character class subtraction (the &&[^...] form, as supported by Java's regex engine) to exclude vowels.
    I didn't have to mention "a" because I started the range from "b".
    Using (?i) makes the regex case insensitive.

    See a live demo.
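
Python's `re` module does not support the `&&` class syntax, so the sketch below swaps in a different technique, a negative lookahead, to get the same "letter but not a vowel" effect. The range starts at "a" here because the lookahead already rules out every vowel; the helper name `first_repeat` is made up for illustration.

```python
import re

# (?![aeiou]) rejects vowels, then [a-z] accepts any remaining letter.
# (?i) makes both the lookahead and the range case-insensitive.
LOOKAHEAD = re.compile(r"(?i)((?![aeiou])[a-z])\1\1+")


def first_repeat(text):
    """Return the first run of 3+ identical consonants, or None."""
    match = LOOKAHEAD.search(text)
    return match.group(0) if match else None


print(first_repeat("TTToo"))  # TTT
print(first_repeat("aaa"))    # None
```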

  • 2021-02-04 03:49

    The regex from the earlier answer, ([b-df-hj-np-tv-z])\1{2,}, has a mistake: "y" was overlooked, so it is matched as a consonant.

    If "y" should not count as a consonant, it should be ([b-df-hj-np-tv-xz])\1{2,}
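
The difference between the two classes only shows up on runs of "y". A small comparison in Python; the names `WITH_Y` and `WITHOUT_Y` are illustrative.

```python
import re

# Range "v-z" includes y; "v-xz" lists v, w, x and z but skips y.
WITH_Y = re.compile(r"([b-df-hj-np-tv-z])\1{2,}")
WITHOUT_Y = re.compile(r"([b-df-hj-np-tv-xz])\1{2,}")

for word in ("yyyes", "zzz"):
    print(word, bool(WITH_Y.search(word)), bool(WITHOUT_Y.search(word)))
# yyyes True False
# zzz True True
```

Which of the two is right depends on whether "y" is meant to count as a consonant for the task at hand.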

  • 2021-02-04 03:52

    I'd personally solve this in reverse; instead of using [b-df-hj-np-tv-z], I'd go with the double-negative, [^\W\d_aeiou].

    /([^\W\d_aeiou])\1\1+/i
    

    This character class uses a double negative: match anything except a non-word character, a digit, an underscore, or a vowel. (\d has to be excluded explicitly because \w also covers digits.) Ignoring non-ASCII letters, only consonants can match this. After capturing one consonant, the regex requires that same consonant again (case-insensitively), then one or more further copies, which gives three or more consecutive identical consonants.
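
A sketch of the double-negative approach in Python. Since `\w` also covers digits, the example compares the class with and without `\d` in the exclusion list; `re.ASCII` is added as an assumption to keep `\w` to ASCII letters, digits and underscore. The names `WITH_DIGITS` and `NO_DIGITS` are illustrative.

```python
import re

FLAGS = re.IGNORECASE | re.ASCII

# Excludes non-word characters, underscore and vowels only.
WITH_DIGITS = re.compile(r"([^\W_aeiou])\1\1+", FLAGS)

# Additionally excludes digits, leaving just ASCII consonants.
NO_DIGITS = re.compile(r"([^\W\d_aeiou])\1\1+", FLAGS)

for text in ("TTToo", "111", "eee"):
    print(text, bool(WITH_DIGITS.search(text)), bool(NO_DIGITS.search(text)))
# TTToo True True
# 111 True False
# eee False False
```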

  • 2021-02-04 03:55

    You can use capture groups with back-references. This will capture any repeated word character (not only consonants):

    /(
       ([\w])        ## second group is just one symbol
       \2            ## match symbol found in second group
       \2+           ## match same symbol one or more times
    )/x              ## x is just to allow inner comments
    

    But not all regexp engines support back-references.
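
The same commented layout can be written in Python, assuming `re.VERBOSE` as the counterpart of the x flag. As in the pattern above, `\w` matches any word character, so this sketch finds repeated letters, digits or underscores rather than consonants only; the name `REPEATED` is illustrative.

```python
import re

REPEATED = re.compile(
    r"""
    (            # group 1: the whole run
      (\w)       # group 2: a single word character
      \2         # the same character again
      \2+        # and again, one or more times
    )
    """,
    re.VERBOSE,
)

match = REPEATED.search("xx aaab")
print(match.group(1))  # aaa
print(match.group(2))  # a
```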
