diff operation within a group, after a dplyr::group_by()

独厮守ぢ 2020-12-31 03:11

Let's say I have this data.frame (with 3 variables):

ID  Period  Score
123 2013    146
123 2014    133
23  2013    150
456 2013    205
456 2014    219
456 20         


        
1 answer
    小鲜肉 (OP)
    2020-12-31 03:25

    Here is another solution using lag. Depending on the use case, it may be more convenient than diff: the NAs clearly show that a value had no predecessor, whereas a 0 from diff could mean either a) a missing predecessor or b) a genuine difference of zero between two periods.

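    library(dplyr)

    # Keep only IDs with more than one row, then subtract the previous Score within each ID
    data %>% group_by(ID) %>% filter(n() > 1) %>%
      mutate(
        Difference = Score - lag(Score)  # NA for the first row of each group
      )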
    
    #   ID Period Score Difference
    # 1 123   2013   146         NA
    # 2 123   2014   133        -13
    # 3 456   2013   205         NA
    # 4 456   2014   219         14
    # 5 456   2015   140        -79
    # 6  78   2012   192         NA
    # 7  78   2013   199          7
    # 8  78   2014   133        -66
    # 9  78   2015   170         37
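
    To make the ambiguity concrete, here is a minimal sketch that computes both variants side by side. It assumes the diff-based approach pads the first row of each group with 0; the column names diff_lag and diff_zero and the small data frame are made up for illustration and only loosely follow the rows shown in the question.

```r
library(dplyr)

# Hypothetical data; ID 23 has a single row, so it has no predecessor at all.
data <- data.frame(
  ID     = c(123, 123, 23, 456, 456),
  Period = c(2013, 2014, 2013, 2013, 2014),
  Score  = c(146, 133, 150, 205, 219)
)

result <- data %>%
  group_by(ID) %>%
  mutate(
    # lag(): the first row of each group has no previous value, so it gets NA
    diff_lag  = Score - lag(Score),
    # diff() returns one value fewer than its input, so it has to be padded;
    # padding with 0 makes "no predecessor" look the same as "no change"
    diff_zero = c(0, diff(Score))
  ) %>%
  ungroup()

print(result)
```

    Both columns rely on the rows already being ordered by Period within each ID. If that is not guaranteed, sorting first or passing order_by = Period to lag() is one option.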
    
