String split with conditions in R

一向 2021-02-04 00:05

I have this mystring with the delimiter _. The condition here is if there are two or more delimiters, I want to split at the second delimiter and if th

7 answers
  •  无人及你
    2021-02-04 01:08

    You can do this with the gsubfn package:

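    library(gsubfn)
    # x: whole match; y: first capture group (text up to the second "_", if any);
    # z: second capture group (the single character right after y)
    f <- function(x, y, z) if (z == "_") y else strsplit(x, ".ReCal", fixed = TRUE)[[1]][[1]]
    gsubfn("([^_]+_[^_]+)(.).*", f, mystring, backref = 2)
    # [1] "MODY_60.2"   "MODY_116.21" "MODY_116.3"  "MODY_116.4"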
    

    This also handles strings that contain more than two "_", still splitting at the second one. For example:

    mystring<-c("MODY_60.2.ReCal.sort.bam",
                "MODY_116.21_C4U.ReCal.sort.bam",
                "MODY_116.3_C2RX-1-10.ReCal.sort.bam",
                "MODY_116.4.ReCal.sort.bam",
                "MODY_116.4_asdfsadf_1212_asfsdf",
                "MODY_116.5.ReCal_asdfsadf_1212_asfsdf",  # split by second "_", leaving ".ReCal"
                "MODY")
    
    gsubfn("([^_]+_[^_]+)(.).*", f, mystring, backref=2)
    # [1] "MODY_60.2"        "MODY_116.21"      "MODY_116.3"       "MODY_116.4"      
    # [5] "MODY_116.4"       "MODY_116.5.ReCal" "MODY"            
    

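    In the function f, x is the whole match (for these inputs, the entire string) and y and z are the first and second capture groups. If z is "_", the string has a second delimiter, and y, the text before it, is returned. Otherwise x is split at the alternative string ".ReCal" and the first piece is kept. A string with no "_" at all, such as "MODY", does not match the pattern and is returned unchanged.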
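
    For comparison, the same rule can be sketched in base R without gsubfn. This is a swapped-in alternative built on grepl and sub, not the approach used in this answer. The helper name split_at_second is made up for illustration, and the sketch assumes the fallback is always to cut at the first ".ReCal".

```r
# Hypothetical helper (not part of the answer above); base R only.
split_at_second <- function(x) {
  # TRUE when the string has a second "_" preceded by two non-empty fields
  has_second <- grepl("^[^_]+_[^_]+_", x)
  ifelse(
    has_second,
    # keep everything before the second "_"
    sub("^([^_]+_[^_]+)_.*$", "\\1", x),
    # otherwise drop the first ".ReCal" and everything after it
    sub("\\.ReCal.*$", "", x)
  )
}

mystring <- c("MODY_60.2.ReCal.sort.bam",
              "MODY_116.21_C4U.ReCal.sort.bam",
              "MODY_116.3_C2RX-1-10.ReCal.sort.bam",
              "MODY_116.4.ReCal.sort.bam",
              "MODY_116.4_asdfsadf_1212_asfsdf",
              "MODY_116.5.ReCal_asdfsadf_1212_asfsdf",
              "MODY")

split_at_second(mystring)
# [1] "MODY_60.2"        "MODY_116.21"      "MODY_116.3"       "MODY_116.4"
# [5] "MODY_116.4"       "MODY_116.5.ReCal" "MODY"
```

    Both sub calls are vectorised, so ifelse can pick the right result per element without an explicit loop.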
