Subset multiple rows with condition

星月不相逢 2021-01-27 11:47

I have a .txt file read into a table called power with over 2 million observations of 9 variables. I am trying to subset power

3 Answers
  •  醉话见心
    2021-01-27 12:17

    My guess is that the Date column in your dataset contains leading or trailing spaces, because with clean values the subset works as expected:

    subset(power, Date %in% c("01/02/2007", "02/02/2007"))
    #       Date Val
    #1 01/02/2007  14
    #8 02/02/2007  28
    

    If I add a trailing or leading space to two of the values, the same call returns no rows:

    power$Date[1] <- '01/02/2007 '
    power$Date[8] <- ' 02/02/2007'
    
    subset(power, Date %in% c("01/02/2007", "02/02/2007"))
    #[1] Date Val 
    #<0 rows> (or 0-length row.names)
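
    To check whether stray whitespace really is the cause, one option is to count the values that change when trimmed. This is only a sketch on a small made-up data frame (the three rows below are an assumption for illustration, not the asker's data); it uses base R's trimws, which is available from R 3.2.0 onward.

    ```r
    # Toy data (hypothetical): two of the three dates carry stray spaces
    power <- data.frame(
      Date = c("01/02/2007 ", "16/12/2006", " 02/02/2007"),
      Val = c(14L, 24L, 28L),
      stringsAsFactors = FALSE
    )

    # TRUE wherever trimming changes the value, i.e. the value is padded
    has_padding <- power$Date != trimws(power$Date)

    # Number of padded values and the rows they sit in
    n_padded <- sum(has_padding)
    padded_rows <- which(has_padding)

    # Distribution of string lengths; clean dd/mm/yyyy dates are all 10 characters
    length_counts <- table(nchar(power$Date))
    ```

    If n_padded is zero on the real data, whitespace is probably not the problem and the column type or date format would be the next thing to look at.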
    

    You could use str_trim from stringr

    library(stringr)
    subset(power, str_trim(Date) %in% c('01/02/2007', '02/02/2007'))
    #         Date Val
    #1 01/02/2007   14
    #8  02/02/2007  28
    

    or use gsub

    subset(power, gsub("^ +| +$", "", Date) %in% c('01/02/2007', '02/02/2007'))
    #         Date Val
    #1 01/02/2007   14
    #8  02/02/2007  28
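
    A base R alternative, assuming R 3.2.0 or later, is trimws, which avoids both the stringr dependency and the hand-written regex. The sketch below rebuilds the padded example data and also shows cleaning the column once up front, which may be preferable on a table with millions of rows; the names wanted, res and res_clean are illustrative only.

    ```r
    # Example data from this answer, with the stray spaces already in rows 1 and 8
    power <- data.frame(
      Date = c("01/02/2007 ", "16/12/2006", "16/12/2006", "16/12/2006",
               "16/12/2006", "16/12/2006", "16/12/2006", " 02/02/2007"),
      Val = c(14L, 24L, 23L, 22L, 23L, 25L, 23L, 28L),
      stringsAsFactors = FALSE
    )
    wanted <- c("01/02/2007", "02/02/2007")

    # Option 1: trim on the fly inside the condition (column stays untouched)
    res <- subset(power, trimws(Date) %in% wanted)

    # Option 2: clean the column once, then subset as in the original attempt
    power$Date <- trimws(power$Date)
    res_clean <- subset(power, Date %in% wanted)
    ```

    Both options select rows 1 and 8; the second also leaves the stored dates free of padding for later steps.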
    

    or, as another option that does not remove the spaces, use grepl

    subset(power, grepl('01/02/2007|02/02/2007', Date))
    #         Date Val
    #1 01/02/2007   14
    #8  02/02/2007  28
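
    One caveat with the grepl approach is that the pattern matches anywhere inside the string, so a longer value that merely contains one of the dates would also be selected. A stricter variant, sketched here, anchors the pattern and allows only whitespace around the date. The vector dates and its malformed last entry are made up for illustration.

    ```r
    # Hypothetical values: two padded dates, one other date, one malformed entry
    dates <- c("01/02/2007 ", " 02/02/2007", "16/12/2006", "01/02/20071")

    # Unanchored pattern as in the answer: matches the malformed entry too
    loose <- grepl("01/02/2007|02/02/2007", dates)

    # Anchored pattern: only optional whitespace may surround the date
    pattern <- "^[[:space:]]*(01/02/2007|02/02/2007)[[:space:]]*$"
    strict <- grepl(pattern, dates)
    ```

    Here loose is TRUE for the first, second and fourth elements, while strict is TRUE only for the first two.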
    

    data

    power <- structure(list(Date = c("01/02/2007", "16/12/2006", "16/12/2006", 
    "16/12/2006", "16/12/2006", "16/12/2006", "16/12/2006", "02/02/2007"
    ), Val = c(14L, 24L, 23L, 22L, 23L, 25L, 23L, 28L)), .Names = c("Date", 
    "Val"), class = "data.frame", row.names = c(NA, -8L))
    
