Remove the letters between two patterns of strings in R

花落未央 2020-12-06 01:58

How can I remove the letters between two specific patterns in R?

For instance

a = "a#g abcdefgtdkfef_jpg>pple"

I would like to

3 Answers
  • 2020-12-06 02:35

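    Adding to the previous replies: if you work with a string that looks like "a#g abcdefgtdkfef_jpg>pple ; #__something_else___jpg>", a greedy pattern such as "#.*jpg>" matches from the first # all the way to the last jpg>, so everything in between is removed and only "a" is left. To avoid that, make the quantifier lazy, as in "#.*?jpg>", so that each match stops at the nearest jpg>. Note that "#[^jpg>]+jpg>" does not achieve this: [^jpg>] is a character class that excludes the single characters j, p, g and >, not the sequence jpg>, so it finds no match in this string (a g follows the first #, and the g in "something" blocks the second).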
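
To see the greedy behaviour on that string, and one way around it, here is a small sketch in base R. The variable names are made up for illustration; the lazy quantifier `.*?` is one possible fix, not the only one.

```r
x <- "a#g abcdefgtdkfef_jpg>pple ; #__something_else___jpg>"

# Greedy: .* runs from the first "#" to the LAST "jpg>"
greedy <- sub("#.*jpg>", "", x)
greedy
# [1] "a"

# Lazy: .*? stops at the NEAREST "jpg>"; gsub() repeats this for every match
lazy <- gsub("#.*?jpg>", "", x)
lazy
# [1] "apple ; "
```

The lazy version keeps the text between the two tagged spans, which the greedy version throws away.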

  • 2020-12-06 02:38

    There's no need to load a package for this operation. You can use the base R function sub, which replaces the first match of a regular expression.

    a <- "a#g abcdefgtdkfef_jpg>pple"
    sub("#g.*jpg>", "", a)
    # [1] "apple"
    

    Regular expression explained:

    • #g matches "#g"
    • .* matches any run of characters (zero or more), as long as possible
    • jpg> matches "jpg>"

    So here we're removing everything starting at #g up to and including jpg>
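
As a quick check of the explanation above, the following sketch extracts the matched part with the base R functions regexpr and regmatches before removing it. The names m, matched and result are illustrative only.

```r
a <- "a#g abcdefgtdkfef_jpg>pple"

# Locate the first match of the pattern and pull out the matched text
m <- regexpr("#g.*jpg>", a)
matched <- regmatches(a, m)
matched
# [1] "#g abcdefgtdkfef_jpg>"

# Removing exactly that text leaves the rest of the string
result <- sub("#g.*jpg>", "", a)
result
# [1] "apple"
```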


    Regarding your comment:

    "I tried to find some function in stringR but I couldn't"

    It's actually spelled stringr (case-sensitive). You could use str_replace.

    library(stringr)
    str_replace(a, "#g.*jpg>", "")
    # [1] "apple"
    
  • I wanted to add to Rich's answer because it does not work when multiple replacements need to be done in the same text.

    If you need to remove the pattern several times in the same string, use gsub instead of sub and make the quantifier lazy (.*?), so that each match stops at the nearest jpg>:

    a <- "a#g abcdefgtdkfef_jpg>pple
    or#g abcdefgtdkfef_jpg>ange
    ma#g abcdefgtdkfef_jpg>ngo"
    
    # Code to get the individual fruits
    gsub("#g.*?jpg>", "", a)
    
    # Output (the line breaks in the input are kept)
    # [1] "apple\norange\nmango"
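
A minimal sketch of the difference between sub and gsub on such a multi-line string, assuming the three lines are joined by plain "\n" with no extra indentation. The variable names are hypothetical.

```r
fruits <- "a#g abcdefgtdkfef_jpg>pple\nor#g abcdefgtdkfef_jpg>ange\nma#g abcdefgtdkfef_jpg>ngo"

# sub() only replaces the first match, so the later tags survive
first_only <- sub("#g.*?jpg>", "", fruits)

# gsub() replaces every match; the lazy .*? keeps each match short
all_matches <- gsub("#g.*?jpg>", "", fruits)

# Split on the line breaks to get one element per fruit
fruit_vec <- strsplit(all_matches, "\n", fixed = TRUE)[[1]]
fruit_vec
# [1] "apple"  "orange" "mango"
```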
    