I'm trying to create a variable that adds one to its previous value, then starts back at 1 when a different variable changes.
Right now, I'm trying to use shift an
There are at least two ways of looking at this, which I will demonstrate with the following sample data:
library(data.table)
DT <- data.table(v1 = c(1, 1, 2, 2, 2, 1, 1, 3, 3, 3, 1, 2),
                 v2 = c(6, 7, 5, 4, 6, 8, 1, 2, 9, 4, 6, 5))
The first is to assume that you want to restart any time there's a change in another variable, even if the value at the time of change has occurred earlier in the set.
If that's the case, you could consider the rleid function from "data.table". Observe how the counter variable gets reset even for previously occurring values in "v1":
DT[, N := sequence(.N), by = rleid(v1)][]
#     v1 v2 N
#  1:  1  6 1
#  2:  1  7 2
#  3:  2  5 1
#  4:  2  4 2
#  5:  2  6 3
#  6:  1  8 1
#  7:  1  1 2
#  8:  3  2 1
#  9:  3  9 2
# 10:  3  4 3
# 11:  1  6 1
# 12:  2  5 1
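To see why the counter restarts, it may help to look at the grouping ids that rleid produces on its own. The sketch below uses the same "v1" values as a plain vector; the names run_id and counter_base are only illustrative, and the base R line is an alternative I am suggesting for comparison rather than something required by the approach above.

```r
library(data.table)

v1 <- c(1, 1, 2, 2, 2, 1, 1, 3, 3, 3, 1, 2)

# rleid() gives every run of identical consecutive values its own id,
# so a value that reappears later starts a new group
run_id <- rleid(v1)
run_id
# [1] 1 1 2 2 2 3 3 4 4 4 5 6

# Base R sketch of the same counter (no data.table needed):
# rle() returns the run lengths, sequence() counts 1..n within each run
counter_base <- sequence(rle(v1)$lengths)
counter_base
# [1] 1 2 1 2 3 1 2 1 2 3 1 1
```

Grouping by run_id is what makes sequence(.N) start again at 1 for each new run.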
The second perspective would be to assume that you are looking for a cumulative count just grouped by another variable, whether the values are contiguous or not. Observe how the counter continues for the repeated values in "v1".
DT[, N := sequence(.N), by = v1][]
#     v1 v2 N
#  1:  1  6 1
#  2:  1  7 2
#  3:  2  5 1
#  4:  2  4 2
#  5:  2  6 3
#  6:  1  8 3
#  7:  1  1 4
#  8:  3  2 1
#  9:  3  9 2
# 10:  3  4 3
# 11:  1  6 5
# 12:  2  5 4
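For this second interpretation, two other spellings should give the same counter, sketched here under the assumption of a reasonably recent "data.table" (rowid was added in version 1.9.8). The column name N2 and the vector N_base are hypothetical names chosen for the example.

```r
library(data.table)

DT <- data.table(v1 = c(1, 1, 2, 2, 2, 1, 1, 3, 3, 3, 1, 2),
                 v2 = c(6, 7, 5, 4, 6, 8, 1, 2, 9, 4, 6, 5))

# rowid() numbers the occurrences of each distinct value of v1,
# so no `by` argument is needed
DT[, N2 := rowid(v1)]
DT$N2
# [1] 1 2 1 2 3 3 4 1 2 3 5 4

# Base R sketch: ave() applies seq_along() within each group of v1
N_base <- ave(DT$v1, DT$v1, FUN = seq_along)
N_base
# [1] 1 2 1 2 3 3 4 1 2 3 5 4
```

Which form to use is mostly a matter of taste; sequence(.N) with by keeps the grouping explicit, while rowid is a little shorter.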