Removing duplicate words in a string in R

栀梦 2020-12-11 03:47

This is to help someone who has just voluntarily removed their question after being asked for the code they had tried, among other comments. Let's assume they tried something like this:

4 answers
  •  有刺的猬
    2020-12-11 04:19

    I'm not sure whether string case is a concern. This solution uses qdap together with its add-on package qdapRegex, so that punctuation and the capital letter at the start of each string don't interfere with the removal but are still kept in the result:

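    str <- c("How do I best try and try and try and find a way to to improve this code?",
        "And and here's a second one one and not a third One.")
    
    library(qdap)  # rm_white_endmark() and rm_default() come from its add-on qdapRegex
    library(dplyr) # provides the pipe operator %>%
    
    str %>% 
        tolower() %>%                                # compare words case-insensitively
        word_split() %>%                             # split each string into tokens
        sapply(., function(x) unbag(unique(x))) %>%  # drop repeated tokens, paste back together
        rm_white_endmark() %>%                       # remove the space left before the end mark
        rm_default(pattern="(^[a-z]{1})", replacement = "\\U\\1") %>%  # re-capitalize the first letter
        unname()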
    
    ## [1] "How do i best try and find a way to improve this code?"
    ## [2] "And here's a second one not third."
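
    Note that unique() drops every later repeat of a word, not only adjacent ones, which is why "a" and the final "One" disappear from the second sentence, and why "I" comes back as "i" after tolower(). If only immediately repeated words should be collapsed, a base R sketch under that assumption might look like the following. The helper name drop_adjacent_repeats is hypothetical, not part of qdap or base R, and it treats a "word" as a run of \w characters:

    ```r
    # Hypothetical helper (not from qdap): collapse runs of the same word,
    # compared case-insensitively, keeping the first occurrence as written.
    drop_adjacent_repeats <- function(x) {
      # \\b(\\w+)\\b      captures one whole word
      # (?:\\s+\\1\\b)+   matches one or more immediate repeats of that word,
      #                   separated only by whitespace
      gsub("\\b(\\w+)\\b(?:\\s+\\1\\b)+", "\\1", x,
           ignore.case = TRUE, perl = TRUE)
    }

    x <- c("How do I best try and try and try and find a way to to improve this code?",
           "And and here's a second one one and not a third One.")

    drop_adjacent_repeats(x)
    ## [1] "How do I best try and try and try and find a way to improve this code?"
    ## [2] "And here's a second one and not a third One."
    ```

    Because this version only looks at neighbouring words, the repeated phrase "try and try and" is left alone, and the original capitalization and punctuation are untouched. Whether that or the unique()-based behaviour is wanted depends on what the question meant by "duplicate".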
    
