Increment count over rows with conditional restarting

南笙 2020-12-21 12:18

I would like to increment a count that restarts from 1 when a condition in an existing column is met.

For example I have the following data frame:

df         
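The data frame is not shown here, but the `x1` and `x2` columns printed in the answers suggest what it looked like. A minimal sketch for recreating it, assuming those printed values are the full input (the `stringsAsFactors = FALSE` argument is an addition so that `x2` stays a character column on older R versions):

```r
# Input reconstructed from the x1 and x2 columns shown in the answers' output.
df <- data.frame(
  x1 = c(10, 100, 200, 300, 87, 90, 45, 80),
  x2 = c("start", "a", "b", "c", "start", "k", "l", "o"),
  stringsAsFactors = FALSE
)

# Desired result: a counter x3 that restarts at 1 on every "start" row,
# i.e. 1 2 3 4 1 2 3 4 for this data.
```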


        
2 Answers
  • 2020-12-21 12:41

    Using base R:

    df$x3 <- with(df, ave(x1, cumsum(x2 == 'start'), FUN = seq_along))
    

    gives:

    > df
       x1    x2 x3
    1  10 start  1
    2 100     a  2
    3 200     b  3
    4 300     c  4
    5  87 start  1
    6  90     k  2
    7  45     l  3
    8  80     o  4
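The base R one-liner above packs two steps into one call. A rough breakdown of those steps, assuming the same `x2` values as in the printed output (the standalone vectors `is_start`, `grp` and `x3` are names introduced here for illustration only):

```r
# Only the x2 column from the output above is needed for this illustration.
x2 <- c("start", "a", "b", "c", "start", "k", "l", "o")

is_start <- x2 == "start"   # TRUE on rows that open a new run
grp <- cumsum(is_start)     # running count of TRUEs: 1 1 1 1 2 2 2 2

# ave() applies FUN within each level of grp and returns the pieces in the
# original row order; seq_along() numbers the rows of each piece from 1.
x3 <- ave(seq_along(x2), grp, FUN = seq_along)
x3
```

Rows that come before the first `start` would fall into group 0 and be numbered from 1 as a group of their own.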
    

    Or with the dplyr or data.table packages:

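    library(dplyr)
    # group rows by a running count of 'start' markers, then number the rows in each group
    df %>%
      group_by(grp = cumsum(x2 == 'start')) %>%
      mutate(x3 = row_number()) %>%
      ungroup()   # drop the grouping; the helper column grp stays in the result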
    
    library(data.table)
    # option 1
    setDT(df)[, x3 := rowid(cumsum(x2 == 'start'))][]
    # option 2
    setDT(df)[, x3 := 1:.N, by = cumsum(x2 == 'start')][]
    
  • 2020-12-21 12:54

    Here is another base R method:

    df$x3 <- sequence(diff(c(which(df$x2 == "start"), nrow(df)+1)))
    

    which returns

    df
       x1    x2 x3
    1  10 start  1
    2 100     a  2
    3 200     b  3
    4 300     c  4
    5  87 start  1
    6  90     k  2
    7  45     l  3
    8  80     o  4
    

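    sequence takes an integer vector and, for each entry n, returns the run 1, 2, ..., n, with all runs concatenated into one vector. Here it is fed the length of each run, which diff computes as the distance between the positions of consecutive "start" rows. So that the last run also gets a length, the position just past the final row of the data.frame, nrow(df)+1, has to be appended to those positions. Note that this relies on the first row being a "start" row; otherwise the result has fewer elements than the data.frame has rows.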
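The steps described above can be traced one at a time. This is only a sketch on the `x2` values from the printed output; `n`, `starts`, `bounds` and `run_lengths` are illustrative names that do not appear in the answer:

```r
x2 <- c("start", "a", "b", "c", "start", "k", "l", "o")
n <- length(x2)                  # plays the role of nrow(df)

starts <- which(x2 == "start")   # positions where a run begins: 1 5
bounds <- c(starts, n + 1)       # add the position past the last row: 1 5 9
run_lengths <- diff(bounds)      # length of each run: 4 4
x3 <- sequence(run_lengths)      # 1 2 3 4 1 2 3 4
x3
```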
