Remove strings found in vector 1, from vector 2

攒了一身酷 2020-12-04 03:39

I have these two vectors:

sample1 <- c(".aaa", ".aarp", ".abb", ".abbott", ".abogado")
sample2 <- c("try1.aarp", "www.tryagain.aaa", "255.255.255.255", "onemoretry.abb.abogado")


        
2 Answers
  • 2020-12-04 04:13

    We can paste the 'sample1' elements together with '|' as the separator, use the result as the pattern argument in gsub, and replace every match with ''.

    gsub(paste(sample1, collapse='|'), '', sample2)
    #[1] "try1"            "www.tryagain"    "255.255.255.255" "onemoretry"  
    

    Or use mgsub

    library(qdap)
    mgsub(sample1, '', sample2)
    #[1] "try1"            "www.tryagain"    "255.255.255.255" "onemoretry"     
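
    As a side note, the dots in the pasted pattern are unescaped, so each one matches any character. The sketch below illustrates this and shows one possible literal alternative. remove_literal is a hypothetical helper name (not part of qdap or base R), and "testxaaa" is a made-up input.

    ```r
    sample1 <- c(".aaa", ".aarp", ".abb", ".abbott", ".abogado")
    sample2 <- c("try1.aarp", "www.tryagain.aaa", "255.255.255.255", "onemoretry.abb.abogado")

    # Unescaped, "." matches any character, so "xaaa" is removed as well.
    gsub(paste(sample1, collapse = "|"), "", "testxaaa")
    # [1] "test"

    # Hypothetical helper: remove one string at a time with fixed = TRUE,
    # longest first so ".abb" cannot eat the start of ".abbott".
    remove_literal <- function(x, patterns) {
      patterns <- patterns[order(nchar(patterns), decreasing = TRUE)]
      Reduce(function(acc, p) gsub(p, "", acc, fixed = TRUE), patterns, x)
    }

    remove_literal(sample2, sample1)
    # [1] "try1"            "www.tryagain"    "255.255.255.255" "onemoretry"
    remove_literal("testxaaa", sample1)
    # [1] "testxaaa"
    ```

    Note that this literal version has no word-boundary check, so it removes the strings wherever they occur.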
    
  • 2020-12-04 04:35

    Try this,

    sample1 <- c(".aaa", ".aarp", ".abb", ".abbott", ".abogado")
    sample2 <- c("try1.aarp", "www.tryagain.aaa", "255.255.255.255", "onemoretry.abb.abogado")
    paste0("(",paste(sub("\\.", "\\\\.", sample1), collapse="|"),")\\b")
    # [1] "(\\.aaa|\\.aarp|\\.abb|\\.abbott|\\.abogado)\\b"
    gsub(paste0("(",paste(sub("\\.", "\\\\.", sample1), collapse="|"),")\\b"), "", sample2)
    # [1] "try1"            "www.tryagain"    "255.255.255.255" "onemoretry" 
    

    Explanation:

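    • sub("\\.", "\\\\.", sample1) escapes the dot in each element, since a dot is a special character in regex. Note that sub replaces only the first match; that is enough here because each element contains exactly one dot, but gsub would be needed if an element could contain several.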

    • paste(sub("\\.", "\\\\.", sample1), collapse="|") combines all the elements with | as delimiter.

    • paste0("(",paste(sub("\\.", "\\\\.", sample1), collapse="|"),")\\b") wraps the alternation in a capturing group and appends a word boundary. The \\b is essential here: it requires each match to end at a word boundary, so an element such as .abb is not removed from the start of a longer word.
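
    The effect of the escaping and of \\b can be checked with a small sketch. The inputs ".co.uk" and "shop.abbey" are made-up examples that do not appear in the question, and the variable names are only illustrative.

    ```r
    sample1 <- c(".aaa", ".aarp", ".abb", ".abbott", ".abogado")

    # sub() escapes only the first dot; gsub() escapes all of them.
    sub("\\.", "\\\\.", ".co.uk")
    # [1] "\\.co.uk"
    gsub("\\.", "\\\\.", ".co.uk")
    # [1] "\\.co\\.uk"

    # Build the same alternation with and without the trailing word boundary.
    alternation <- paste(gsub("\\.", "\\\\.", sample1), collapse = "|")
    with_boundary    <- paste0("(", alternation, ")\\b")
    without_boundary <- paste0("(", alternation, ")")

    # ".abbey" is a hypothetical suffix that is not in sample1.
    x <- "shop.abbey"
    gsub(with_boundary, "", x)
    # [1] "shop.abbey"
    gsub(without_boundary, "", x)
    # [1] "shopey"
    ```

    Without the boundary, .abb matches inside .abbey and leaves a mangled string behind; with it, the input is left untouched.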
