I have a large dataset as follows:
Date rain code
2009-04-01 0.0 0
2009-04-02 0.0 0
2009-04-03 0.0 0
2009-04-04 0.7 1
2009-04-05 54.2 1
2009-04
One straightforward solution is to use `rle`. But I suspect there might be more "elegant" solutions out there.
# assuming dd is your data.frame
dd.rle <- rle(dd$code)
# get the start position of each run of consecutive 1's
start <- (cumsum(dd.rle$lengths) - dd.rle$lengths + 1)[dd.rle$values == 1]
# how long does each run of 1's extend?
ival <- dd.rle$lengths[dd.rle$values == 1]
# using these two, sum rain over each run (rows start .. start + length - 1)
vapply(seq_along(start), function(idx) {
  sum(dd$rain[start[idx]:(start[idx] + ival[idx] - 1)])
}, numeric(1))
# [1] 54.9 30.1 111.9
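To make the intermediate objects easier to follow, here is a minimal sketch on a toy data frame. Only the first five rows come from the question; the remaining rows are made up for illustration, so the runs below are not the ones in the real data.

```r
# Toy data: first five rows are from the question, the rest are invented.
dd <- data.frame(
  Date = as.Date("2009-04-01") + 0:9,
  rain = c(0, 0, 0, 0.7, 54.2, 0, 0, 30.1, 0, 0),
  code = c(0, 0, 0, 1, 1, 0, 0, 1, 0, 0)
)

dd.rle <- rle(dd$code)
dd.rle$lengths   # 3 2 2 1 2  (length of each run)
dd.rle$values    # 0 1 0 1 0  (value repeated in each run)

# cumsum(lengths) is the last row of every run; stepping back by
# (length - 1) gives the first row of every run
end   <- cumsum(dd.rle$lengths)      # 3 5 7 8 10
start <- end - dd.rle$lengths + 1    # 1 4 6 8 9

# keep only the runs whose value is 1
start[dd.rle$values == 1]            # 4 8
dd.rle$lengths[dd.rle$values == 1]   # 2 1
```

So in this toy example the 1's occupy rows 4 to 5 and row 8, which are exactly the row ranges that get summed.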
Edit: An even simpler method with `rle` and `tapply`.
dd.rle <- rle(dd$code)
# get the length of each run of consecutive 1's
ival <- dd.rle$lengths[dd.rle$values == 1]
# using these lengths, construct a `factor` with one level per run of 1's
levl <- factor(rep(seq_along(ival), ival))
# group `rain[code == 1]` by these levels and compute the sum of each group
tapply(dd$rain[dd$code == 1], levl, sum)
# 1 2 3
# 54.9 30.1 111.9
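If the start date of each run is wanted next to its total, the same grouping idea can be reused. The following is only a sketch on toy data (the first five rows are from the question, the rest are invented), and the names `run_id`, `wet`, `totals` and `starts` are hypothetical helpers, not part of the answer above.

```r
# Toy data: first five rows are from the question, the rest are invented.
dd <- data.frame(
  Date = as.Date("2009-04-01") + 0:9,
  rain = c(0, 0, 0, 0.7, 54.2, 0, 0, 30.1, 0, 0),
  code = c(0, 0, 0, 1, 1, 0, 0, 1, 0, 0)
)

r <- rle(dd$code)
# label every row with the index of the run it belongs to
run_id <- rep(seq_along(r$lengths), r$lengths)   # 1 1 1 2 2 3 3 4 5 5
wet <- dd$code == 1

# total rain per run of 1's
totals <- tapply(dd$rain[wet], run_id[wet], sum)
# first date of each run of 1's (first row seen for each run id)
starts <- dd$Date[wet][!duplicated(run_id[wet])]

data.frame(start = starts, total = as.vector(totals))
```

Here the grouping factor is built for all rows first and then subset with `wet`, which keeps the rain values, the dates and the run labels aligned by construction.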