Removing characters after a EURO symbol in R

时光取名叫无心 2021-01-22 18:22

I have a euro symbol saved in the "euro" variable:

euro <- "\u20AC"
euro
#[1] "€"

And the "eurosearch" variable contains "services as defi

2 Answers
  • 2021-01-22 18:42

    You can use variables in the pattern by just concatenating strings using paste0:

    euro <- "€"
    eurosearch <- "services as defined in this SOW at a price of € 15,896.80 (if executed fro"
    sub(paste0("^.*", gsub("([^A-Za-z_0-9])", "\\\\\\1", euro), "\\s*(\\S+).*"), "\\1", eurosearch)
    
    euro <- "$"
    eurosearch <- "services as defined in this SOW at a price of $ 25,196.4 (if executed fro"
    sub(paste0("^.*", gsub("([^A-Za-z_0-9])", "\\\\\\1", euro), "\\s*(\\S+).*"), "\\1", eurosearch)
    

    See CodingGround demo

    Note that with gsub("([^A-Za-z_0-9])", "\\\\\\1", euro) I am escaping any non-word symbols so that $ could be treated as a literal, not a special regex metacharacter (taken from this SO post).
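
    The escaping step above can be wrapped in small helpers so the same call works for any currency symbol. This is only a sketch: the names escape_regex and amount_after are hypothetical and do not come from the original answer, and the pattern is the one shown above.

    ```r
    # Hypothetical helper: put a backslash before every non-word character
    # so that the symbol is matched literally inside a regex.
    escape_regex <- function(s) gsub("([^A-Za-z_0-9])", "\\\\\\1", s)

    # Hypothetical helper: return the first run of non-whitespace
    # characters that follows `symbol` in `text`.
    amount_after <- function(symbol, text) {
      pattern <- paste0("^.*", escape_regex(symbol), "\\s*(\\S+).*")
      sub(pattern, "\\1", text)
    }

    amount_after("$", "at a price of $ 25,196.4 (if executed fro")
    # [1] "25,196.4"
    amount_after("\u20AC", "at a price of \u20AC 15,896.80 (if executed fro")
    ```

    Without the escaping, "$" would be read as an end-of-string anchor and the pattern would not find the literal symbol.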

  • 2021-01-22 18:43

    Use regmatches from base R, or str_extract from stringr:

    > x <- "services as defined in this SOW at a price of € 15,896.80 (if executed fro"
    > regmatches(x, regexpr("(?<=€ )\\S+", x, perl=TRUE))
    [1] "15,896.80"
    

    or

    > gsub("€ (\\S+)|.", "\\1", x)
    [1] "15,896.80"
    

    or

    Using variables.

    euro <- "\u20AC"
    gsub(paste(euro , "(\\S+)|."), "\\1", x) 
    

    If the variable-based version does not work for you, you may need to mark the string's encoding as UTF-8:

    gsub(paste(euro , "(\\S+)|."), "\\1", `Encoding<-`(x, "UTF-8"))
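
    To check whether encoding is the problem, it may help to inspect how R has marked the strings before changing anything. The sketch below uses base R's Encoding() and enc2utf8(); using enc2utf8() here is an assumption on my part and not part of the original answer. Unlike re-marking the bytes, it converts the string to UTF-8.

    ```r
    euro <- "\u20AC"
    x <- "services as defined in this SOW at a price of \u20AC 15,896.80 (if executed fro"

    # Show the declared encoding of each string
    # (one of "unknown", "latin1", "UTF-8" or "bytes").
    Encoding(c(euro, x))

    # Convert the input to UTF-8 before matching, then keep only the
    # token that follows the euro sign and a space.
    amount <- gsub(paste(euro, "(\\S+)|."), "\\1", enc2utf8(x))
    amount
    # [1] "15,896.80"
    ```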
    

    Source
