I have an Excel CSV with a date/time column and a value associated with that date/time. I'm trying to write a script that will go through this format (see below), and find 1) t
If you are dealing with time series data, I suggest you use a time series class like zoo or xts.
dat <- read.table(text=" V1 V2 V3
1 5/1/2012 3:00 1
2 5/1/2012 6:00 2
3 5/1/2012 9:00 5
4 5/1/2012 12:00 3
5 5/1/2012 15:00 6
6 5/1/2012 18:00 2
7 5/1/2012 21:00 1
8 5/2/2012 0:00 2
9 5/2/2012 3:00 3
10 5/2/2012 6:00 6
11 5/2/2012 9:00 4
12 5/2/2012 12:00 6
13 5/2/2012 15:00 7
14 5/2/2012 18:00 9
15 5/2/2012 21:00 1", row.names=1, header=TRUE)
library(xts)  # library() stops with an error if xts is missing; require() only warns
# create an xts object
xobj <- xts(dat[, 3], order.by=as.POSIXct(paste(dat[, 1], dat[, 2]), format="%m/%d/%Y %H:%M"))
If you just wanted to get the daily maximums, and you were okay with using the last time of the day as the index, you could use apply.daily:
apply.daily(xobj, max)
# [,1]
#2012-05-01 21:00:00 6
#2012-05-02 21:00:00 9
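The reason the index above lands on the last time of each day can be seen by spelling the steps out with endpoints and period.apply, which, as far as I know, is what apply.daily does internally. This is a minimal sketch on a small stand-in series; the names times, x, ep and daily_max and the six values are made up for illustration, and tz = "UTC" is an assumption to keep the day boundaries unambiguous.

```r
library(xts)

# Small stand-in series: two days, three observations each (illustrative values)
times <- as.POSIXct(c("2012-05-01 09:00", "2012-05-01 15:00", "2012-05-01 21:00",
                      "2012-05-02 09:00", "2012-05-02 18:00", "2012-05-02 21:00"),
                    format = "%Y-%m-%d %H:%M", tz = "UTC")
x <- xts(c(5, 6, 1, 4, 9, 1), order.by = times)

# Row positions of the last observation of each day, with a leading 0
ep <- endpoints(x, on = "days")

# Apply max() to each block of rows between consecutive endpoints;
# each result is stamped with the time of the block's last row
daily_max <- period.apply(x, INDEX = ep, FUN = max)
```

Because each block is labelled with its final timestamp, the result says which day a maximum belongs to but not when within the day it occurred.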
To keep the timestamp at which each daily maximum occurs, you could do this:
do.call(rbind, lapply(split(xobj, "days"), function(x) x[which.max(x), ]))
# [,1]
#2012-05-01 15:00:00 6
#2012-05-02 18:00:00 9
split(xobj, "days") creates a list with one day's data in each element. lapply applies a function to each day; the function, in this case, simply returns the max observation for each day. The lapply call will return a list of xts objects. To turn it back into a single xts object, use do.call.

do.call(rbind, X) constructs a call to rbind using each element of the list. It is equivalent to rbind(X[[1]], X[[2]], ..., X[[n]]).
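The same split / lapply / do.call(rbind, ...) pattern can be tried without any packages. The sketch below uses a plain data frame as a stand-in for the xts object, so it is an illustration of the pattern rather than the xts code itself; the names df, by_day, max_rows, combined and by_hand and the four rows of data are made up for the example.

```r
# Toy data frame standing in for the xts object (illustrative values)
df <- data.frame(
  day   = c("2012-05-01", "2012-05-01", "2012-05-02", "2012-05-02"),
  time  = c("12:00", "15:00", "15:00", "18:00"),
  value = c(3, 6, 7, 9),
  stringsAsFactors = FALSE
)

# split: a list with one data frame per day
by_day <- split(df, df$day)

# lapply: from each day, keep only the row holding that day's maximum
max_rows <- lapply(by_day, function(x) x[which.max(x$value), ])

# do.call: bind every element of the list back into one data frame
combined <- do.call(rbind, max_rows)

# The same call written out by hand for a two-element list
by_hand <- rbind(max_rows[[1]], max_rows[[2]])
```

Both combined and by_hand hold the same two rows (row names may differ because the list is named). The advantage of do.call is that it works for a list of any length without typing out each element.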