Question
I came across the question "Cumulative sum that resets when 0 is encountered" via https://stackoverflow.com/a/32502162/13269143 , which answered my question only in part. I first wanted to create a column that accumulates, row by row, the values of each sequence in column b, where sequences are separated by a 0. I achieved this with the code:
setDT(df)[, whatiwant := cumsum(b), by = rleid(b == 0L)]
as suggested in that answer (the other solutions offered there did not work for me; they only produced NA values). This is the column labelled "Accumulated" in the illustration below. Now I also want to create a third column, "What I Want" in the illustration, that assigns to every observation in a sequence the maximum accumulated value of that sequence. To illustrate:
b  Accumulated  What I Want
1  1            3
1  2            3
1  3            3
0  0            0
1  1            4
1  2            4
1  3            4
1  4            4
0  0            0
0  0            0
0  0            0
1  1            2
1  2            2
There might be a very simple way to do this. Thank you in advance.
Answer 1:
You can use max instead of cumsum in your attempt, applied to the Accumulated column:
library(data.table)
setDT(df)[, whatiwant := max(Accumulated), by = rleid(b == 0L)]
df
# b Accumulated whatiwant
# 1: 1 1 3
# 2: 1 2 3
# 3: 1 3 3
# 4: 0 0 0
# 5: 1 1 4
# 6: 1 2 4
# 7: 1 3 4
# 8: 1 4 4
# 9: 0 0 0
#10: 0 0 0
#11: 0 0 0
#12: 1 1 2
#13: 1 2 2
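For readers without data.table, the same grouping idea can be sketched in base R. This is only an illustration, not part of the original answer: the names is_zero and grp are made up here, and grp is a hand-rolled stand-in for rleid(b == 0L) that increases every time b switches between zero and non-zero.

```r
# Example data from the question
b <- c(1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1)

# Run id: starts at 1 and increases whenever (b == 0) changes value
is_zero <- b == 0
grp <- cumsum(c(TRUE, is_zero[-1] != is_zero[-length(is_zero)]))

# Cumulative sum within each run, then the maximum of that sum per run
accumulated <- ave(b, grp, FUN = cumsum)
whatiwant <- ave(accumulated, grp, FUN = max)

df <- data.frame(b, Accumulated = accumulated, whatiwant)
df$whatiwant
# [1] 3 3 3 0 4 4 4 4 0 0 0 2 2
```

ave() returns a vector of the same length as its input, so the per-run maximum is repeated for every row of the run.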
Answer 2:
You can use rle and inverse.rle like this:
b <- c(1,1,1,0,1,1,1,1,0,0,0,1,1)
x <- rle(b)
i <- x$values == 1
x$values[i] <- x$lengths[i]
inverse.rle(x)
# [1] 3 3 3 0 4 4 4 4 0 0 0 2 2
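The snippet above relies on every non-zero value in b being 1, so that the run length equals the run total. If b could hold other non-zero values, one possible variation is to take runs of b != 0 and sum each run. This is a sketch under that assumption, with made-up example data and illustrative names (runs, run_id, run_total):

```r
# Hypothetical data with non-zero values other than 1
b <- c(2, 1, 3, 0, 1, 1, 0, 0, 5)

# Runs of "non-zero" versus "zero" rather than runs of identical values
runs <- rle(b != 0)

# One id per run, repeated for every element of that run
run_id <- rep(seq_along(runs$lengths), runs$lengths)

# Total of each run, then mapped back onto every element
run_total <- tapply(b, run_id, sum)
result <- as.vector(run_total[run_id])
result
# [1] 6 6 6 0 2 2 0 0 5
```

For 0/1 data such as the vector in the question, this gives the same result as the inverse.rle approach.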
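Answer 3:
You can use the rle() function to get the run lengths and then mapply() to repeat each run length once per element of its run, which yields the vector you want after the zero runs are set back to 0:
library(dplyr)  # provides tibble(), mutate() and %>%
d <- tibble(b = c(1,1,1,0,1,1,1,1,0,0,0,1,1),
            WhatIWant = unlist(mapply(rep, rle(b)$lengths, rle(b)$lengths))) %>%
  mutate(WhatIWant = ifelse(b == 0, 0, WhatIWant))  # runs of zeros get 0
d
This gives: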
# A tibble: 13 x 2
       b WhatIWant
   <dbl>     <dbl>
 1     1         3
 2     1         3
 3     1         3
 4     0         0
 5     1         4
 6     1         4
 7     1         4
 8     1         4
 9     0         0
10     0         0
11     0         0
12     1         2
13     1         2
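Since rep() accepts a vector for its times argument, the same expansion can probably be written without mapply() or any packages. A minimal base R sketch, where r, run_len and what_i_want are illustrative names only:

```r
b <- c(1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1)

r <- rle(b)

# Repeat each run length once per element of its run
run_len <- rep(r$lengths, times = r$lengths)

# Runs of zeros should report 0 rather than their own length
what_i_want <- ifelse(b == 0, 0, run_len)
what_i_want
# [1] 3 3 3 0 4 4 4 4 0 0 0 2 2
```

As with the other rle-based answers, this assumes b contains only 0 and 1.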
Source: https://stackoverflow.com/questions/61912387/count-the-number-of-na-values-in-a-row-reset-when-0