Question
I have an xts object covering 169 days of regular, high-frequency 5-minute observations, but some days have missing observations, i.e. fewer than 288 data points. How do I remove those days so that only days with a full set of data points remain?
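# Find the days present in the data and report how many records each has
ddx = endpoints(dxts, on="days");
days = format(index(dxts)[ddx], "%Y-%m-%d");
for (day in days) {
  x = dxts[day];
  # NROW counts rows even when dxts has several columns
  cat('', day, "has", NROW(x), "records...\n");
}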
I tried:
RTAQ::exchangeHoursOnly(dxts, daybegin = "00:00:00", dayend = "23:55:00")
but this still returned the full data set.
Thanks
Answer 1:
Split by days. Count the number of rows in each day, and keep only the days that have at least 288 rows.
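library(xts)
# Example data: 1000 observations at 5-minute intervals
dxts <- .xts(rnorm(1000), 1:1000*5*60)
# Return a day's data if it is complete; otherwise the function returns NULL
daylist <- lapply(split(dxts, "days"), function(x) {
  if(NROW(x) >= 288) x
})
# Bind the remaining days back into a single xts object
do.call(rbind, daylist)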
The above splits dxts by "days". Then, if a day has at least 288 rows, it returns all the data for that day; otherwise, it returns NULL. So daylist will be a list whose elements are each either an xts object or NULL. The do.call part calls rbind on the list; it's like calling rbind(daylist[[1]], daylist[[2]], ..., daylist[[n]]). The NULLs are dropped in the bind, so you'll be left with a single xts object that omits days with fewer than 288 rows.
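If you would rather not build an intermediate list, the same filter can probably be written by counting rows per calendar day and subsetting directly. This is only a sketch, not part of the original answer: the sample data, the explicit tzone = "UTC" (set so the day boundaries are reproducible), and the names day_key, n_per_day and complete are all assumptions made for illustration.

```r
library(xts)

# Hypothetical sample data: 1000 five-minute observations, index kept in UTC
# so that day boundaries do not depend on the local time zone.
set.seed(1)
dxts <- .xts(rnorm(1000), 1:1000 * 5 * 60, tzone = "UTC")

# Label every observation with its calendar day.
day_key <- format(index(dxts), "%Y-%m-%d")

# For every observation, the number of observations sharing its day.
n_per_day <- ave(seq_along(day_key), day_key, FUN = length)

# Keep only observations that belong to a complete day (288 five-minute bars).
complete <- dxts[n_per_day >= 288]

# Check: every remaining day should report exactly 288 rows.
table(format(index(complete), "%Y-%m-%d"))
```

With this sample data the first and last days are partial, so only the two days in between should remain. On real data, days containing a daylight-saving change will not have 288 bars in a local time zone, so you may want to decide how to treat those before filtering.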
Source: https://stackoverflow.com/questions/11051819/removing-dates-with-less-than-full-observations