Subset multiple rows with condition

星月不相逢 2021-01-27 11:47

I have a .txt file read into a table called power with over 2 million observations of 9 variables. I am trying to subset power

3 Answers
  • 2021-01-27 12:17

    I am guessing that the Date column in your dataset has leading or trailing spaces, because the same call works on clean data:

    subset(power, Date %in% c("01/02/2007", "02/02/2007"))
    #       Date Val
    #1 01/02/2007  14
    #8 02/02/2007  28
    

    If I change the rows to

    power$Date[1] <- '01/02/2007 '
    power$Date[8] <- ' 02/02/2007'
    
    subset(power, Date %in% c("01/02/2007", "02/02/2007"))
    #[1] Date Val
    #<0 rows> (or 0-length row.names)
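
    To check whether padding is actually present before picking a fix, one option is to compare each value with its trimmed form. This is only a sketch: it assumes R 3.2.0 or later (for base trimws), and the three-row data frame is made up for illustration.

    ```r
    # Toy data (hypothetical): rows 1 and 3 carry stray spaces
    power <- data.frame(
      Date = c("01/02/2007 ", "16/12/2006", " 02/02/2007"),
      Val = c(14L, 24L, 28L),
      stringsAsFactors = FALSE
    )

    # TRUE wherever the stored string differs from its trimmed version
    padded <- power$Date != trimws(power$Date)

    any(padded)    # is there any padding at all?
    which(padded)  # which rows are affected
    ```

    If any(padded) is FALSE, whitespace is not the cause and the problem lies elsewhere.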
    

    You could use str_trim from stringr

    library(stringr)
    subset(power, str_trim(Date) %in% c('01/02/2007', '02/02/2007'))
    #         Date Val
    #1 01/02/2007   14
    #8  02/02/2007  28
    

    or use gsub

    subset(power, gsub("^ +| +$", "", Date) %in% c('01/02/2007', '02/02/2007'))
    #         Date Val
    #1 01/02/2007   14
    #8  02/02/2007  28
    

    or another option, without removing the spaces, would be to use grepl

    subset(power, grepl('01/02/2007|02/02/2007', Date))
    #         Date Val
    #1 01/02/2007   14
    #8  02/02/2007  28
    

    data

    power <- structure(list(Date = c("01/02/2007", "16/12/2006", "16/12/2006", 
    "16/12/2006", "16/12/2006", "16/12/2006", "16/12/2006", "02/02/2007"
    ), Val = c(14L, 24L, 23L, 22L, 23L, 25L, 23L, 28L)), .Names = c("Date", 
    "Val"), class = "data.frame", row.names = c(NA, -8L))
    
  • 2021-01-27 12:20

    Try:

    > subpower = power[power$Date %in% c("01/02/2007", "02/02/2007") ,]
    > subpower
            Date Val
    1 01/02/2007  14
    8 02/02/2007  28
    

    (Using power data from @akrun's answer)

    Moreover, your own code will work if you use the proper name of the subset: "subpower" instead of "powersub"!

    > subpower <- subset(power, Date %in% c("01/02/2007", "02/02/2007"))
    > subpower
            Date Val
    1 01/02/2007  14
    8 02/02/2007  28
    >
    > str(subpower)
    'data.frame':   2 obs. of  2 variables:
     $ Date: chr  "01/02/2007" "02/02/2007"
     $ Val : int  14 28
    
  • 2021-01-27 12:24

    Your approach is correct. Try reading in the text file with

    power <- read.table("textfile.txt", stringsAsFactors = FALSE)
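
    As a rough illustration of that call, the sketch below reads a few in-memory lines instead of a file. The separator, the header row, and the sample values are assumptions, since the question does not show the file's layout. strip.white = TRUE is an additional read.table argument, not part of the suggestion above; it only takes effect when sep is given, and it drops leading and trailing whitespace from unquoted character fields.

    ```r
    # Hypothetical file contents; the real file's layout is not shown in the question
    lines <- c("Date;Val", "01/02/2007 ;14", "16/12/2006;24", " 02/02/2007;28")

    # text= stands in for the file path; strip.white trims padded fields on read
    power <- read.table(text = lines, sep = ";", header = TRUE,
                        stringsAsFactors = FALSE, strip.white = TRUE)

    str(power)  # Date should come back as character, Val as integer

    subpower <- subset(power, Date %in% c("01/02/2007", "02/02/2007"))
    subpower
    ```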
    