Question
Disclaimer: I am going to come out of this looking silly.
I have a data frame containing a column of dates of class POSIXct. I am trying to remove the rows that contain specific dates (public holidays). I tried to do that using this:
> modelset.nonholiday <- modelset[!modelset$date == as.POSIXct("2013-12-31")|
!modelset$date ==as.POSIXct("2013-07-04") |
!modelset$date == as.POSIXct("2014-07-04")|
!modelset$date == as.POSIXct ("2013-11-28") |
!modelset$date == as.POSIXct ("2013-11-29") |
!modelset$date == as.POSIXct ("2013-12-24") |
!modelset$date == as.POSIXct ("2013-12-25") |
!modelset$date == as.POSIXct ("2014-02-14") |
!modelset$date == as.POSIXct ("2014-04-20") |
!modelset$date == as.POSIXct ("2014-05-26"), ]
The above didn't work. It returns the data frame removing only the first. So I tried:
modelset[!modelset$date %in% c("2013-12-31", "2013-07-04", "2014-07-04",
"2013-11-28", "2013-11-29", "2013-12-24", "2013-12-25", "2014-02-14",
"2014-04-20", "2014-05-26"), ]
This didn't work either. I also tried:
`%notin%` <- function(x,y) !(x %in% y)
modelset[modelset$date %notin% as.POSIXct(c("2013-12-31", "2013-07-04", "2014-07-04",
"2013-11-28", "2013-11-29", "2013-12-24", "2013-12-25", "2014-02-14",
"2014-04-20", "2014-05-26")), ]
I've referred to "Remove Rows From Data Frame where a Row match a String", "R remove rows containing a certain value", and "Standard way to remove multiple elements from a dataframe", but can't seem to find what I am doing wrong.
> head(modelset)
date spot.volume.loc spot.volume.nat nat.imp.a loc.imp.a nat.imp.m loc.imp.m branded.leads esi.leads
1 2013-07-01 2988 215 13931 4155.3 5770 1853.7 331 363
2 2013-07-02 3200 218 12589 4651.3 5374 2207.8 293 428
3 2013-07-03 3066 203 10305 3921.0 4754 1759.2 273 325
4 2013-07-04 3153 83 2353 4135.6 999 1912.2 172 184
5 2013-07-05 2959 59 1553 3573.4 815 1662.3 193 246
6 2013-07-06 667 53 2219 456.7 889 214.8 161 203
tv.leads callin.leads total.leads total.imp.a total.imp.m day week quarter on.off
1 195 41 930 18086.3 7623.7 Monday 26 Q3 1.25
2 192 50 963 17240.3 7581.8 Tuesday 26 Q3 1.00
3 149 38 785 14226.0 6513.2 Wednesday 26 Q3 1.00
4 34 0 390 6488.6 2911.2 Thursday 26 Q3 1.00
5 50 18 507 5126.4 2477.3 Friday 26 Q3 0.75
6 14 9 387 2675.7 1103.8 Saturday 26 Q3 0.50
Answer 1:
For an answer using dplyr and your %notin% approach, you also have:
library(dplyr)
dates <-
as.POSIXct(c("2013-12-31", "2013-07-04", "2014-07-04", "2013-11-28", "2013-11-29",
"2013-12-24", "2013-12-25", "2014-02-14", "2014-04-20", "2014-05-26"))
`%notin%` <- function(x,y) !(x %in% y)
modelset %>%
filter(date %notin% dates)
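As a side note on why the `%notin%` filter behaves differently from the chain of `!... | !...` in the question, here is a minimal sketch. The vectors `d`, `h1`, and `h2` are hypothetical stand-ins for `modelset$date` and two holidays, and the `tz = "UTC"` argument is an assumption made only so the example is reproducible.

```r
# Hypothetical toy dates standing in for modelset$date
d  <- as.POSIXct(c("2013-07-03", "2013-07-04", "2013-12-25"), tz = "UTC")
h1 <- as.POSIXct("2013-07-04", tz = "UTC")
h2 <- as.POSIXct("2013-12-25", tz = "UTC")

# OR of negations: every date differs from at least one holiday,
# so this is TRUE everywhere and no row would be dropped
keep_or <- !(d == h1) | !(d == h2)

# AND of negations: TRUE only when the date differs from every holiday
keep_and <- !(d == h1) & !(d == h2)

# Set-membership form, equivalent to the AND chain
keep_in <- !(d %in% c(h1, h2))

print(keep_or)
print(keep_and)
print(keep_in)
```

In other words, "not A or not B" keeps a row unless it equals both holidays at once, which can never happen for a single date. Negating one `%in%` test avoids writing the long `&` chain by hand.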
Answer 2:
Use the which() function like so:
dat <- as.POSIXct(c("2013-12-31", "2013-07-04", "2014-07-04",
"2013-11-28", "2013-11-29", "2013-12-24", "2013-12-25", "2014-02-14",
"2014-04-20", "2014-05-26"))
# %in% tests set membership; != would recycle the shorter vector element by element
dat[which(!dat %in% as.POSIXct(c("2013-12-31", "2014-07-04")))]
In your case, I believe it would be:
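# Convert the holidays to POSIXct so both sides of %in% have the same class,
# and keep the trailing comma so the index selects rows rather than columns
modelset <- modelset[which(!modelset$date %in% as.POSIXct(c("2013-12-31", "2013-07-04", "2014-07-04",
"2013-11-28", "2013-11-29", "2013-12-24", "2013-12-25", "2014-02-14",
"2014-04-20", "2014-05-26"))), ]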
What which() does is return the positions at which the condition evaluates to TRUE. Placed inside the brackets, before the comma, those positions are the row numbers to keep, so only those rows are returned.
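A small sketch of that behaviour may help. The data frame `toy` and the vector `holidays` are hypothetical stand-ins for `modelset` and the holiday list, and `tz = "UTC"` is assumed only to keep the example reproducible.

```r
# Hypothetical three-row stand-in for modelset
toy <- data.frame(
  date  = as.POSIXct(c("2013-07-03", "2013-07-04", "2013-07-05"), tz = "UTC"),
  leads = c(785, 390, 507)
)
holidays <- as.POSIXct(c("2013-07-04", "2013-12-25"), tz = "UTC")

# which() turns the logical test into integer row positions
keep <- which(!toy$date %in% holidays)
print(keep)

# Positions before the comma select rows; all columns are kept
by_position <- toy[keep, ]

# The logical vector can also be used directly, without which()
by_logical <- toy[!toy$date %in% holidays, ]

print(by_position)
```

Both forms return the same two rows here. The logical form is shorter, while `which()` is handy when the row numbers themselves are needed.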
Source: https://stackoverflow.com/questions/27067637/removing-rows-containing-specific-dates-in-r