String split with conditions in R

一向 2021-02-04 00:05

I have this mystring with the delimiter _. The condition here is if there are two or more delimiters, I want to split at the second delimiter and if th

7 answers
  •  挽巷
    挽巷 (original poster)
    2021-02-04 00:42

    Perl/PCRE has a branch reset feature, written (?|...), that lets capturing groups in different alternatives share the same group number, so together they are treated as one capturing group.

    IMO, this feature is elegant when you want to supply different alternatives.

    x <- c('MODY_60.2.ReCal.sort.bam', 'MODY_116.21_C4U.ReCal.sort.bam', 
           'MODY_116.3_C2RX-1-10.ReCal.sort.bam', 'MODY_116.4.ReCal.sort.bam',
           'MODY_116.4_asdfsadf_1212_asfsdf', 'MODY_116.5.ReCal_asdfsadf_1212_asfsdf', 'MODY')
    
    # (?|...) is a branch reset group: the capture in either alternative is group 1.
    sub('^(?|([^_]*_[^_]*)_.*|(.*)\\.ReCal.*)$', '\\1', x, perl = TRUE)
    # [1] "MODY_60.2"        "MODY_116.21"      "MODY_116.3"       "MODY_116.4"      
    # [5] "MODY_116.4"       "MODY_116.5.ReCal" "MODY"  
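
For comparison, here is a rough sketch of the same substitution written with an ordinary non-capturing group instead of branch reset. The variable names `plain` and `reset` are only for illustration, and the sketch assumes R's usual behaviour that a group which did not take part in the match expands to an empty string in the replacement. Without branch reset the two alternatives are numbered 1 and 2, so the replacement has to reference both.

```r
x <- c("MODY_116.21_C4U.ReCal.sort.bam", "MODY_60.2.ReCal.sort.bam")

# Ordinary alternation: the first alternative captures into group 1,
# the second into group 2, so the replacement needs "\\1\\2".
plain <- sub("^(?:([^_]*_[^_]*)_.*|(.*)\\.ReCal.*)$", "\\1\\2", x, perl = TRUE)

# Branch reset: both alternatives capture into group 1,
# so "\\1" alone is enough.
reset <- sub("^(?|([^_]*_[^_]*)_.*|(.*)\\.ReCal.*)$", "\\1", x, perl = TRUE)

print(identical(plain, reset))
```

Both forms should give the same result here; branch reset mainly keeps the replacement string simple as more alternatives are added.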
    
