Regex to replace words with more than two consecutive identical characters

说谎 2021-01-17 05:25

How can I detect whether a word contains the same character more than twice in a row, and remove that word?

I seem to be able to do it like this:

# example         


        
3 Answers
  • 2021-01-17 05:36

    You can use grepl instead.

    mystring <- c(1, 2, 3, "toot", "tooooot", "good", "apple", "banana")
    mystring[!grepl("(.)\\1{2,}", mystring)]
    ## [1] "1"      "2"      "3"      "toot"   "good"   "apple"  "banana"
    

    **Explanation**
    \\1 is a backreference to the first capture group (in this case (.) ). {2,} requires the preceding token to match at least 2 times. Since we want to match any character repeated 3 or more times in a row, (.) matches the first occurrence and \\1 then has to match 2 or more times.
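
    To see the counting in action, here is a small sketch. The input words are made up for illustration and are not from the question; only the pattern is the one used above.

    ```r
    # Illustrative inputs (assumed, not from the original post)
    words <- c("abc", "too", "tooo", "aabbcc", "1111")

    # One captured character followed by at least two more copies of it,
    # i.e. the same character three or more times in a row
    pattern <- "(.)\\1{2,}"
    has_triple <- grepl(pattern, words)
    print(setNames(has_triple, words))

    # For comparison: {1,} only needs one extra copy, so doubled
    # characters such as the "oo" in "too" are flagged as well
    has_double <- grepl("(.)\\1{1,}", words)
    print(setNames(has_double, words))
    ```

    Only "tooo" and "1111" are flagged by the first pattern, while the second pattern flags everything except "abc".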

  • 2021-01-17 05:45

    Combine the expressions like so:

    gsub("^[[:alpha:]]*([[:alpha:]])\\1\\1[[:alpha:]]*$", "", mystring)
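
    Note that gsub keeps the vector the same length and leaves an empty string where a word matched. A minimal sketch of dropping those blanks afterwards, assuming one alphabetic word per element; the sample vector and the names blanked and kept are illustrative, and perl = TRUE is my addition to make the regex engine explicit:

    ```r
    # Sample data (assumed for illustration)
    mystring <- c("toot", "tooooot", "good", "apple", "banana")

    # Blank out any purely alphabetic word that contains a letter
    # repeated three times in a row
    blanked <- gsub("^[[:alpha:]]*([[:alpha:]])\\1\\1[[:alpha:]]*$", "",
                    mystring, perl = TRUE)
    print(blanked)

    # nzchar() is TRUE for non-empty strings, so this drops the blanks
    kept <- blanked[nzchar(blanked)]
    print(kept)
    ```

    If the positions in the vector matter (for example, it is a data frame column), keeping the blanks may be what you want; otherwise filtering gives the same result as the grepl answer.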
    
  • 2021-01-17 05:51

    Another possibility:

    mystring[grepl("(.{1})\\1{2,}", mystring, perl = TRUE)] <- ""
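
    The question does not say whether the words arrive one per element or inside longer strings. If it is the latter, a rough sketch along the same lines could look like this; the sentence and the names sentence, stripped and cleaned are made up, and \\w here treats digits and underscores as word characters too:

    ```r
    # A made-up sentence containing two words with a tripled letter
    sentence <- "that was a goooood and loooong day"

    # Match a whole word that contains some character three or more
    # times in a row, and replace it with nothing
    stripped <- gsub("\\b\\w*(\\w)\\1{2,}\\w*\\b", "", sentence, perl = TRUE)

    # Collapse the leftover runs of spaces and trim the ends
    cleaned <- trimws(gsub("\\s+", " ", stripped))
    print(cleaned)
    ```

    For this input the result is "that was a and day".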
    