R: Remove parts of column names after certain characters

既然无缘 2020-12-30 07:55

I have a large data set with thousands of columns. The column names include various unwanted characters as follows:

col1_3x_xxx
col2_3y_xyz
col3_3z_zyx

I would like to remove everything from _3 onward so that the names become col1, col2 and col3.


        
3 Answers
  • 2020-12-30 08:08

    Certainly late with this answer, but in case someone is still looking for a solution:

    # col is the index (or name) of the column to rename
    colnames(df1)[col] <- sub("_3.*", "", colnames(df1)[col])
    

    And if you have multiple columns:

    # seq_len() is safe even when df1 has zero columns
    for (col in seq_len(ncol(df1))) {
        colnames(df1)[col] <- sub("_3.*", "", colnames(df1)[col])
    }
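
    Since sub() accepts a whole character vector, the loop can probably be skipped. The following is a minimal sketch, assuming a small made-up df1 that stands in for the real data; the duplicate check at the end is an extra precaution, not part of the original answer.

    ```r
    # Hypothetical data frame standing in for the asker's df1
    df1 <- data.frame(col1_3x_xxx = 1:2, col2_3y_xyz = 3:4, col3_3z_zyx = 5:6)

    # sub() is vectorized, so all column names are cleaned in one call
    colnames(df1) <- sub("_3.*", "", colnames(df1))
    print(colnames(df1))

    # Stripping suffixes can make two names collide;
    # anyDuplicated() returns 0 when every name is unique
    if (anyDuplicated(colnames(df1)) > 0) {
        warning("some column names are no longer unique")
    }
    ```

    With thousands of columns, the uniqueness check is worth keeping, because two different suffixes on the same prefix would end up with the same name.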
    
  • 2020-12-30 08:16

    We can use str_extract from the stringr package with the regular expression pattern "^[^_]+(?=_)":

    stringr::str_extract(c("col1_3x_xxx", "col2_3y_xyz", "col3_3z_zyx"), "^[^_]+(?=_)")
    #[1] "col1" "col2" "col3"
    

    In the pattern:

    ^ anchors the match at the beginning of the string. [^_]+ matches one or more characters other than _ (inside the brackets, ^_ means "any character but _"). (?=_) is a positive lookahead: the match must be followed by _, but the _ itself is not included in the result.
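
    One thing to watch for: because the lookahead requires an underscore, str_extract returns NA for a name that contains none. The sketch below shows the same pattern in base R, assuming a made-up name "id" with no underscore, and keeps such names unchanged. perl = TRUE is needed for lookahead support in regexpr().

    ```r
    # "id" is a hypothetical name without an underscore
    x <- c("col1_3x_xxx", "col2_3y_xyz", "id")

    # regexpr() returns -1 where the pattern does not match
    m <- regexpr("^[^_]+(?=_)", x, perl = TRUE)

    # regmatches() returns only the matched pieces, so fill them in by position
    prefix <- x                        # default: keep the original name
    prefix[m > 0] <- regmatches(x, m)
    print(prefix)
    ```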

  • 2020-12-30 08:17

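    We can use sub. This assumes the names are stored as values in the first column of df1; to clean the actual column names, pass names(df1) instead of df1[,1]:

    sub("_3.*", "", df1[,1])
    #[1] "col1" "col2" "col3"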
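
    The pattern "_3.*" only strips suffixes that begin with _3. A small sketch of the difference from the broader "_.*", assuming a made-up third name "col4_4a_abc" that is not in the question:

    ```r
    # "col4_4a_abc" is a hypothetical name whose suffix does not start with _3
    x <- c("col1_3x_xxx", "col2_3y_xyz", "col4_4a_abc")

    # Removes everything from the first "_3" onward; other names are untouched
    strict <- sub("_3.*", "", x)
    print(strict)

    # Removes everything from the first underscore onward
    broad <- sub("_.*", "", x)
    print(broad)
    ```

    Which of the two is appropriate depends on whether every unwanted suffix really starts with _3.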
    