Question
I'm fairly new to R and would appreciate help with this problem, as I haven't been able to find an answer online. This is part of my data frame (DF); it continues in this format until 2008.
Counter Date Hour counts
1245 26/05/2006 0 1
1245 26/05/2006 100 0
1245 26/05/2006 200 2
1245 26/05/2006 300 0
1245 26/05/2006 400 5
1245 26/05/2006 500 3
1245 26/05/2006 600 9
1245 26/05/2006 700 10
1245 26/05/2006 800 15
This is my question: I need to subset my data so that, if any count between the hours of 600 and 2200 is over 0, the whole day (000 to 2300) is kept in the data set; if there are no counts in that period (600 to 2200), the whole day needs to be deleted. How can I do this?
I tried the following piece of code, but it keeps ONLY the rows between 600 and 2200 hours that have counts, and I can't figure out how to make it keep the whole day.
DF2=DF[(DF$hour>=600)&(DF$hour<=2200)&(DF$counts>0),] ##16hr worth of counts from 600 to 2200
I then aggregate the hourly counts into daily counts using the following code:
daily=subset(DF2)
daily$date = as.Date(daily$date, "%m/%d/%Y")
agg=aggregate(counts~ date, daily, sum)
town=merge(agg,DF2$counter,all=TRUE)
Thank you so much for your help in advance, Katie
Answer 1:
Try this:
TDF <- subset(DF, hour>=600 & hour<=2200)
# get dates where at least one hour in the range has a nonzero count
dates <- subset(aggregate(counts~Date,TDF,sum),counts>0)$Date
# get dates where no hour in the range has a zero count
dates2 <- subset(aggregate(counts~Date,TDF,prod),counts>0)$Date
DF2 <- subset(DF,Date %in% dates)
DF3 <- subset(DF,Date %in% dates2)
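To see the first variant (DF2) in action, here is a minimal sketch on made-up data. The two days and their counts are invented for illustration, and the column names hour, counts and Date are assumed to match the code above rather than the capitalisation in the question's sample.

```r
# Toy data (invented for illustration): two days, three hourly rows each.
DF <- data.frame(
  Counter = 1245,
  Date    = rep(c("26/05/2006", "27/05/2006"), each = 3),
  hour    = rep(c(0, 600, 700), times = 2),
  counts  = c(1, 0, 2,   # 26/05: nonzero count at 700, inside the window
              4, 0, 0),  # 27/05: only nonzero count is at hour 0, outside it
  stringsAsFactors = FALSE
)

# Rows inside the 600 to 2200 window
TDF <- subset(DF, hour >= 600 & hour <= 2200)

# Dates whose in-window counts sum to more than zero
dates <- subset(aggregate(counts ~ Date, TDF, sum), counts > 0)$Date

# Keep every row (all hours) of those dates
DF2 <- subset(DF, Date %in% dates)
print(DF2)
```

Only the rows for 26/05/2006 should survive, including the hour 0 row that lies outside the window.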
Answer 2:
plyr is your friend :)
install.packages("plyr")
library(plyr)
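ddply(DF, .(Date), function(day) {
  # keep the whole day only if the counts between 600 and 2200 sum to more than zero
  in_window <- day$hour >= 600 & day$hour <= 2200
  if (sum(day$counts[in_window]) > 0) day
  else subset(day, hour == -1)  # empty data frame, so the day is dropped
})
ddply will group entries in DF by Date, then for every group, if there is at least one count above zero between hours 600 and 2200, return that whole day; otherwise return an empty data frame. ddply will then combine all groups into a resulting data frame.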
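If you would rather not install a package, the same per-day filter can be sketched in base R with ave(), followed by the daily aggregation from the question. This is only a sketch on invented toy data, with column names assumed as above. Note that the sample dates look like day/month/year, so the format string here is "%d/%m/%Y"; with "%m/%d/%Y" a date such as 26/05/2006 would parse to NA.

```r
# Toy data (invented for illustration): two days, three hourly rows each.
DF <- data.frame(
  Counter = 1245,
  Date    = rep(c("26/05/2006", "27/05/2006"), each = 3),
  hour    = rep(c(0, 600, 700), times = 2),
  counts  = c(1, 0, 2, 4, 0, 0),
  stringsAsFactors = FALSE
)

# TRUE for rows inside the 600 to 2200 window
in_window <- DF$hour >= 600 & DF$hour <= 2200

# Per-date sum of in-window counts, repeated on every row of that date
window_total <- ave(DF$counts * in_window, DF$Date, FUN = sum)

# Keep all hours of the days that have any in-window count
kept <- DF[window_total > 0, ]

# Daily totals over the kept days (all hours of each kept day are summed)
kept$Date <- as.Date(kept$Date, format = "%d/%m/%Y")
daily <- aggregate(counts ~ Counter + Date, data = kept, FUN = sum)
print(daily)
```

Aggregating on Counter + Date keeps the counter id in the result, which may make the later merge step unnecessary.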
Source: https://stackoverflow.com/questions/6582189/subset-data-for-a-day-if-data-between-two-hours-of-the-day-meets-criteria