I have a .txt file read into a table called power with over 2 million observations of 9 variables. I am trying to subset power.
I am guessing that the Date column in your dataset has leading or trailing spaces, because the same subset works on clean data:
subset(power, Date %in% c("01/02/2007", "02/02/2007"))
# Date Val
#1 01/02/2007 14
#8 02/02/2007 28
If I add a trailing or leading space to two of the rows, the same call returns nothing:
power$Date[1] <- '01/02/2007 '
power$Date[8] <- ' 02/02/2007'
subset(power, Date %in% c("01/02/2007", "02/02/2007"))
#[1] Date Val
#<0 rows> (or 0-length row.names)
You could use str_trim from stringr to strip the spaces before matching:
library(stringr)
subset(power, str_trim(Date) %in% c('01/02/2007', '02/02/2007'))
# Date Val
#1 01/02/2007 14
#8 02/02/2007 28
or use gsub
subset(power, gsub("^ +| +$", "", Date) %in% c('01/02/2007', '02/02/2007'))
# Date Val
#1 01/02/2007 14
#8 02/02/2007 28
or, as another option that leaves the spaces in place, use grepl
subset(power, grepl('01/02/2007|02/02/2007', Date))
# Date Val
#1 01/02/2007 14
#8 02/02/2007 28
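Before picking one of these options, it may help to confirm that stray spaces really are the cause. The following is only a sketch on a small made-up data frame (power_demo, has_padding and subpower_demo are hypothetical names, not part of the question). It counts the padded values and then trims them with base R's trimws(), which is available from R 3.2.0 onward:

```r
# Small stand-in for the real data, with stray spaces added on purpose.
power_demo <- data.frame(
  Date = c("01/02/2007 ", "16/12/2006", " 02/02/2007"),
  Val  = c(14L, 24L, 28L),
  stringsAsFactors = FALSE
)

# TRUE for every value that starts or ends with whitespace.
has_padding <- grepl("^\\s|\\s$", power_demo$Date)
sum(has_padding)  # number of padded values

# trimws() strips whitespace from both ends by default.
power_demo$Date <- trimws(power_demo$Date)
subpower_demo <- subset(power_demo, Date %in% c("01/02/2007", "02/02/2007"))
subpower_demo
```

If the count is zero on the real data, the spaces are probably not the problem and the cause lies elsewhere.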
power <- structure(list(Date = c("01/02/2007", "16/12/2006", "16/12/2006",
"16/12/2006", "16/12/2006", "16/12/2006", "16/12/2006", "02/02/2007"
), Val = c(14L, 24L, 23L, 22L, 23L, 25L, 23L, 28L)), .Names = c("Date",
"Val"), class = "data.frame", row.names = c(NA, -8L))
Try:
> subpower = power[power$Date %in% c("01/02/2007", "02/02/2007") ,]
> subpower
Date Val
1 01/02/2007 14
8 02/02/2007 28
(Using power data from @akrun's answer)
Moreover, your own code will work if you use the correct name for the subset: "subpower" instead of "powersub"!
> subpower <- subset(power, Date %in% c("01/02/2007", "02/02/2007"))
> subpower
Date Val
1 01/02/2007 14
8 02/02/2007 28
>
> str(subpower)
'data.frame': 2 obs. of 2 variables:
$ Date: chr "01/02/2007" "02/02/2007"
$ Val : int 14 28
Your approach is correct. Try reading in the text file with stringsAsFactors = FALSE, so that Date is read as a character vector rather than a factor:
power <- read.table("textfile.txt", stringsAsFactors = FALSE)
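If leading or trailing spaces in the file turn out to be the issue, read.table can also strip them while reading. This is a minimal sketch under stated assumptions: the temporary file, its contents, the header row and the ";" separator are all made up for illustration and may not match the real file. Note that strip.white only takes effect when sep is given explicitly.

```r
# Hypothetical sample file; separator, header and values are assumptions.
tmp <- tempfile(fileext = ".txt")
writeLines(c("Date;Val",
             "01/02/2007 ;14",
             "16/12/2006;24",
             " 02/02/2007;28"), tmp)

# strip.white = TRUE removes leading/trailing whitespace from unquoted
# character fields; it is only used when sep is specified.
power_demo <- read.table(tmp, header = TRUE, sep = ";",
                         stringsAsFactors = FALSE, strip.white = TRUE)

subpower_demo <- subset(power_demo, Date %in% c("01/02/2007", "02/02/2007"))
subpower_demo
```

Cleaning the values at read time means the later subset call needs no trimming of its own.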