Split vector of strings and paste subset of resulting elements into a new vector

悲&欢浪女 2021-02-19 23:52

Define

z <- as.character(c("1_xx xx xxx_xxxx_12_sep.xls", "2_xx xx xxx_xxxx_15_aug.xls"))

such that

> z
[1] "1_xx xx x

4 Answers
  •  时光取名叫无心
    2021-02-20 00:34

    Using a bit of magic in the stringr package: I separately extract the left and right date fields, combine them, and finally remove the .xls at the end.

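    library(stringr)
    l <- str_extract(z, "\\d+_")             # leading number and underscore, e.g. "1_"
    r <- str_extract(z, "\\d+_\\w*\\.xls")   # trailing date plus extension, e.g. "12_sep.xls"
    # Escape the dot and anchor at the end so only the literal ".xls" suffix is removed.
    gsub("\\.xls$", "", paste(l, r, sep=""))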
    
    [1] "1_12_sep" "2_15_aug"
    

    str_extract is a convenience wrapper that I find easier to use than the equivalent base R functions.
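For comparison, here is a minimal sketch of the same two-step extraction using only base R. It assumes `regmatches()` with `regexpr()` as a stand-in for `str_extract()`; this is my substitution, not something the answer itself uses. The variable name `result` is introduced here for illustration.

```r
# Same input as in the question.
z <- c("1_xx xx xxx_xxxx_12_sep.xls", "2_xx xx xxx_xxxx_15_aug.xls")

# regexpr() locates the first match in each string; regmatches() pulls it out.
# Caveat: regmatches() silently drops elements with no match, so this sketch
# assumes every file name contains both patterns.
l <- regmatches(z, regexpr("\\d+_", z, perl = TRUE))
r <- regmatches(z, regexpr("\\d+_\\w*\\.xls", z, perl = TRUE))

# Join the two pieces and strip the literal ".xls" suffix.
result <- sub("\\.xls$", "", paste0(l, r))
print(result)
```

Unlike `str_extract()`, which returns `NA` for a string without a match, this version would return vectors of different lengths, so it is only a reasonable swap when the file names are known to follow the pattern.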

    Edit: Here is a short explanation of what the regex does:

    • \\d+ matches one or more digits. The backslash distinguishes the digit class from a literal character d; it is doubled because the pattern is written as an R string.
    • \\w* matches zero or more word characters (letters, digits, or the underscore). Again, the backslash marks it as a character class rather than a literal w.
    • \\. matches a literal period. It needs to be escaped because an unescaped period matches any single character.

    In theory the regex should be quite flexible. Because \\d+ accepts a run of digits of any length, it should match both single- and double-digit numbers in your dates.
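The behaviour of the individual pieces described above can be checked in isolation. The sketch below uses base R only, and the input strings are hypothetical examples made up for illustration, not taken from the question.

```r
# Hypothetical inputs, chosen to exercise each piece of the pattern.
x <- c("7_jan.xls", "25_dec.xls", "report_xls")

# \\d+ matches a run of digits of any length, so "7" and "25" are both found.
# The third string has no digits and is dropped by regmatches().
digits <- regmatches(x, regexpr("\\d+", x, perl = TRUE))
print(digits)

# An escaped dot matches only a literal period ...
literal_dot <- grepl("\\.xls", x, perl = TRUE)
# ... while an unescaped dot matches any character, including "_".
any_char <- grepl(".xls", x, perl = TRUE)
print(literal_dot)
print(any_char)

# \\w counts the underscore as a word character.
underscore_is_word <- grepl("^\\w+$", "a_b", perl = TRUE)
print(underscore_is_word)
```

The contrast between `literal_dot` and `any_char` on "report_xls" is the reason for escaping the period when stripping the file extension.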
