extract part of a file name in R

终归单人心 2021-01-25 08:33

I'm trying to write some code to open all the data files in a folder, apply a function (or set of functions) to extract my data of interest. So far, so good. The problem is t

1 answer
  • 2021-01-25 09:29

    The pattern here is a date, an optional E\digit or Expt\digit that you don't want, a word that you do want, then an optional SDM that you don't want followed by 'data copy.txt'...

    Here's my test data:

    > names
    [1] "2012-05-31 CTN1 data copy.txt"          
    [2] "2012-05-21 E7 PMA1 data copy.txt"       
    [3] "2011-11-29 TDH3 SDM data copy.txt"      
    [4] "2012-01-04 POX1 data copy.txt"          
    [5] "2011-11-29 ECHO data copy.txt"          
    [6] "2011-11-29 E8 ECHO data copy.txt"       
    [7] "2011-11-29 ECHO SDM data copy.txt"      
    [8] "2011-11-29 Expt2 ECHO SDM data copy.txt"
    

    and here's my sub:

    > sub(pattern="^....-..-.. (E\\d+ |Expt\\d+ )*(\\w+) (SDM )*data copy.txt","\\2",names)
    [1] "CTN1" "PMA1" "TDH3" "POX1" "ECHO" "ECHO" "ECHO" "ECHO"
    

    This will also work if your E-prefixes have more than one digit. I've added some names starting with E to my test set to make sure they get treated properly, as well as the case of an E-prefix combined with an SDM.
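
    As a rough sketch, the same idea can be written a little more strictly. This variant is an assumption on my part rather than the pattern used above: it matches only digits in the date, allows the E/Expt prefix and SDM at most once each, escapes the dot in ".txt", and anchors the end of the name. The variable names and the last two file names are hypothetical, added only to exercise a two-digit prefix and a name that should not match.

```r
# Two file names from the test set above, plus two hypothetical ones.
file_names <- c(
  "2012-05-21 E7 PMA1 data copy.txt",
  "2011-11-29 Expt2 ECHO SDM data copy.txt",
  "2011-11-29 E12 ECHO SDM data copy.txt",  # hypothetical: two-digit E-prefix
  "notes about ECHO.txt"                     # hypothetical: should not match
)

# Stricter variant of the pattern (assumed, not the original):
#   ^\d{4}-\d{2}-\d{2}    date made of digits only
#   (?:(?:E|Expt)\d+ )?   optional E<digits> or Expt<digits> prefix, not captured
#   (\w+)                 the word we want (capture group 1)
#   (?:SDM )?             optional SDM, not captured
#   data copy\.txt$       literal suffix with an escaped dot, anchored at the end
pattern <- "^\\d{4}-\\d{2}-\\d{2} (?:(?:E|Expt)\\d+ )?(\\w+) (?:SDM )?data copy\\.txt$"

# sub() returns its input unchanged when nothing matches, so check with
# grepl() first and use NA for names that do not fit the pattern.
matched <- grepl(pattern, file_names, perl = TRUE)
labels <- ifelse(matched,
                 sub(pattern, "\\1", file_names, perl = TRUE),
                 NA_character_)

print(labels)  # expected: "PMA1" "ECHO" "ECHO" NA
```

    Checking with grepl() first means an unexpected file name shows up as NA instead of silently coming back as the full, unmodified name.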
