Removing rows containing specific dates in R

你说的曾经没有我的故事 提交于 2021-02-18 19:34:40

问题


Disclaimer: I am going to come out of this looking silly.

I have a data frame containing a column which has a date of class POSIXct. I am trying to remove some of the rows containing specific dates- public holidays. I tried to do that using this:

> modelset.nonholiday <- modelset[!modelset$date == as.POSIXct("2013-12-31")| !modelset$date ==as.POSIXct("2013-07-04") | !modelset$date == as.POSIXct("2014-07-04")| !modelset$date == as.POSIXct ("2013-11-28") | !modelset$date == as.POSIXct ("2013-11-29") | !modelset$date == as.POSIXct ("2013-12-24") | !modelset$date == as.POSIXct ("2013-12-25") | !modelset$date == as.POSIXct ("2014-02-14") | !modelset$date == as.POSIXct ("2014-04-20") | !modelset$date == as.POSIXct ("2014-05-26"), ]

The above didn't work. It returns the data frame removing only the first So I tried :

modelset[!modelset$date %in% c("2013-12-31", "2013-07-04", "2014-07-04",
             "2013-11-28", "2013-11-29", "2013-12-24", "2013-12-25", "2014-02-14", 
             "2014-04-20", "2014-05-26"), ]

This didn't work either. I also tried:

`%notin%` <- function(x,y) !(x %in% y) 

modelset[modelset$date %notin% as.POSIXct(c("2013-12-31", "2013-07-04", "2014-07-04",
                 "2013-11-28", "2013-11-29", "2013-12-24", "2013-12-25", "2014-02-14",
                 "2014-04-20", "2014-05-26")), ]`

I've referred Remove Rows From Data Frame where a Row match a String, R remove rows containing a certain value, and Standard way to remove multiple elements from a dataframe but can't seem to find what I am doing wrong.

> head(modelset)
    date spot.volume.loc spot.volume.nat nat.imp.a loc.imp.a nat.imp.m loc.imp.m branded.leads esi.leads
1 2013-07-01            2988             215     13931    4155.3      5770    1853.7           331       363
2 2013-07-02            3200             218     12589    4651.3      5374    2207.8           293       428
3 2013-07-03            3066             203     10305    3921.0      4754    1759.2           273       325
4 2013-07-04            3153              83      2353    4135.6       999    1912.2           172       184
5 2013-07-05            2959              59      1553    3573.4       815    1662.3           193       246
6 2013-07-06             667              53      2219     456.7       889     214.8           161       203
tv.leads callin.leads total.leads total.imp.a total.imp.m       day week quarter on.off
1      195           41         930     18086.3      7623.7    Monday   26      Q3   1.25
2      192           50         963     17240.3      7581.8   Tuesday   26      Q3   1.00
3      149           38         785     14226.0      6513.2 Wednesday   26      Q3   1.00
4       34            0         390      6488.6      2911.2  Thursday   26      Q3   1.00
5       50           18         507      5126.4      2477.3    Friday   26      Q3   0.75
6       14            9         387      2675.7      1103.8  Saturday   26      Q3   0.50

回答1:


For an answer using dplyr and using your %notin% approach, you also have:

library(dplyr)

dates <- 
  as.POSIXct(c("2013-12-31", "2013-07-04", "2014-07-04", "2013-11-28", "2013-11-29", 
               "2013-12-24", "2013-12-25", "2014-02-14", "2014-04-20", "2014-05-26"))

`%notin%` <- function(x,y) !(x %in% y) 

modelset %>%
  filter(date %notin% dates)



回答2:


Use the which statement like so:

dat <- as.POSIXct(c("2013-12-31", "2013-07-04", "2014-07-04",
                                         "2013-11-28", "2013-11-29", "2013-12-24", "2013-12-25", "2014-02-14", 
                                         "2014-04-20", "2014-05-26"))

dat[which(dat != as.POSIXct(c("2013-12-31", "2014-07-04")))]

In your case, I believe it would be:

modelset <- modelset[which(!modelset$date %in% c("2013-12-31", "2013-07-04", "2014-07-04",
         "2013-11-28", "2013-11-29", "2013-12-24", "2013-12-25", "2014-02-14", 
         "2014-04-20", "2014-05-26"))]

What the which statement does is return row numbers where it's evaluated to be true. Then having it inside the brackets, it specifies those row numbers as the only ones to show.



来源:https://stackoverflow.com/questions/27067637/removing-rows-containing-specific-dates-in-r

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!