get sum of consecutive day values

醉话见心 2021-01-13 18:24

I have a large dataset as follows:

Date       rain code
2009-04-01  0.0 0 
2009-04-02  0.0 0 
2009-04-03  0.0 0 
2009-04-04  0.7 1 
2009-04-05 54.2 1  
2009-04         


        
2 Answers
  •  -上瘾入骨i
    2021-01-13 18:33

    One straightforward solution is to use rle (run-length encoding), which finds each run of consecutive equal values in code. There may well be more "elegant" solutions out there.
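As a quick illustration of what rle returns, here is a minimal sketch on a made-up code vector (the values are hypothetical, not the asker's data). It shows how the run lengths turn into start positions of the runs of 1s.

```r
# Hypothetical code vector, for illustration only
code <- c(0, 0, 1, 1, 0, 1, 1, 1)

r <- rle(code)
r$lengths  # 2 2 1 3  (length of each run)
r$values   # 0 1 0 1  (value repeated in each run)

# Start index of each run: end position minus run length, plus one
starts <- cumsum(r$lengths) - r$lengths + 1
starts                 # 1 3 5 6
starts[r$values == 1]  # 3 6  (where the runs of 1s begin)
```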

    # assuming dd is your data.frame
    dd.rle <- rle(dd$code)
    # start position of each run of consecutive 1s
    start  <- (cumsum(dd.rle$lengths) - dd.rle$lengths + 1)[dd.rle$values == 1]
    # length of each run of 1s
    ival   <- dd.rle$lengths[dd.rle$values == 1]
    # sum rain over each run, from its start to start + length - 1
    mapply(function(s, n) sum(dd$rain[s:(s + n - 1)]), start, ival)
    
    # [1]  54.9  30.1 111.9
    

    Edit: An even simpler method with rle and tapply.

    dd.rle <- rle(dd$code)
    # length of each run of consecutive 1s
    ival <- dd.rle$lengths[dd.rle$values == 1]
    # build a factor with one level per run, each level repeated by its run length
    levl  <- factor(rep(seq_along(ival), ival))
    # group rain[code == 1] by run and sum each group
    tapply(dd$rain[dd$code == 1], levl, sum)
    
    #    1     2     3 
    # 54.9  30.1 111.9 
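For anyone who wants to try the tapply approach without the original data, the following is a self-contained sketch. The data frame below is made up (only the column names follow the question), and sum_wet_runs is a hypothetical helper name, not something from the answer.

```r
# Hypothetical data; column names follow the question, values are invented
dd <- data.frame(
  Date = as.Date("2009-04-01") + 0:7,
  rain = c(0.0, 0.0, 0.7, 54.2, 0.0, 1.5, 2.5, 3.0),
  code = c(0, 0, 1, 1, 0, 1, 1, 1)
)

# Hypothetical helper: total rain within each run of consecutive code == 1
sum_wet_runs <- function(rain, code) {
  r <- rle(code)
  ival <- r$lengths[r$values == 1]               # length of each run of 1s
  run_id <- factor(rep(seq_along(ival), ival))   # one level per run
  tapply(rain[code == 1], run_id, sum)           # sum rain within each run
}

res <- sum_wet_runs(dd$rain, dd$code)
res  # run 1: 0.7 + 54.2 = 54.9, run 2: 1.5 + 2.5 + 3.0 = 7
```

The result is named by run number; wrapping it in as.numeric() drops the names if a plain vector is preferred.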
    
