Trim part of a string in a data frame

独厮守ぢ 2020-12-21 13:21

I have a data frame column that looks like this:

AA1_123.zip
BB2_456.txt
CCC_789.doc

How can I change it to this:

AA1
BB2
CCC


        
3 Answers
  • 2020-12-21 13:29

    You could also read the column again with read.table(), using comment.char = "_" so that everything from the first underscore to the end of each line is discarded as a comment.

    df <- data.frame(x = c("AA1_123.zip", "BB2_456.txt", "CCC_789.doc"))
    
    read.table(text = as.character(df$x), comment.char="_")
    #    V1
    # 1 AA1
    # 2 BB2
    # 3 CCC
    

    Or you can use scan()

    scan(text = as.character(df$x), what = "", comment.char="_")
    # Read 3 items
    # [1] "AA1" "BB2" "CCC"
    
  • 2020-12-21 13:37

    You could try sub(), which here removes the first underscore and everything after it:

    sub('_.*', '', df1$Col)
    #[1] "AA1" "BB2" "CCC"
    

    data

    df1 <- structure(list(Col = c("AA1_123.zip", "BB2_456.txt", 
    "CCC_789.doc"
    )), .Names = "Col", class = "data.frame", row.names = c(NA, -3L))
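
    The sub() call above only returns the trimmed vector. A minimal sketch of writing the result back into the data frame, assuming the column is named Col as in the sample data and that overwriting the original values is acceptable:

    ```r
    # Sample data, rebuilt here so the snippet runs on its own.
    df1 <- data.frame(Col = c("AA1_123.zip", "BB2_456.txt", "CCC_789.doc"),
                      stringsAsFactors = FALSE)

    # Remove the first underscore and everything after it,
    # then overwrite the column with the trimmed values.
    df1$Col <- sub("_.*", "", df1$Col)

    df1$Col
    ```

    Strings that contain no underscore are left unchanged by this pattern.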
    
  • 2020-12-21 13:39

    If every string follows the same pattern, with exactly three characters before the underscore, this will work:

    df1 <- structure(list(Col = c("AA1_123.zip", "BB2_456.txt", 
                                  "CCC_789.doc"
    )), .Names = "Col", class = "data.frame", row.names = c(NA, -3L))
    
    substr(df1$Col, 1, 3)
    # [1] "AA1" "BB2" "CCC"
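
    If the prefix is not always three characters long, one possible variation is to locate the underscore with regexpr() and cut just before it. This is only a sketch; the vector x below is made-up example data, not part of the question:

    ```r
    # Hypothetical input with prefixes of different lengths.
    x <- c("A_123.zip", "BB2_456.txt", "CCCC_789.doc")

    # Position of the first literal underscore in each string
    # (regexpr() returns -1 where there is no underscore).
    pos <- regexpr("_", x, fixed = TRUE)

    # Keep the characters before the underscore.
    # Note: a string without an underscore becomes "" here.
    prefix <- substr(x, 1, pos - 1)

    prefix
    ```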
    