R: How to calculate lag for multiple columns by group for data table

攒了一身酷 2021-01-25 14:52

I would like to calculate the diff of variables in a data table, grouped by id. Here is some sample data. The data is recorded at a sample rate of 1 Hz. I would like to estim

1 Answer
  • 2021-01-25 15:29

    You can try this with data.table:

    setnames(dt[, lapply(.SD, function(x) c(NA, diff(x))), by = id],
             2:3, c('dx', 'dy'))[]
    #    id dx dy
    #1:  1 NA NA
    #2:  1  1  2
    #3:  1  1  1
    #4:  2 NA NA
    #5:  2  4 -6
    #6:  2  1  1
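
    For a version that runs on its own, here is a minimal sketch. The sample values are an assumption inferred from the output shown in this answer (the question's data is not reproduced here), and lag_diff is a hypothetical helper name.

```r
library(data.table)

# Sample data inferred from the output in this answer (an assumption).
dt <- data.table(x  = c(1, 2, 3, 1, 5, 6),
                 y  = c(2, 4, 5, 8, 2, 3),
                 id = c(1, 1, 1, 2, 2, 2))

# Hypothetical helper: first difference padded with a leading NA, so the
# result has as many rows as the group it was computed on.
lag_diff <- function(v) c(NA, diff(v))

# .SD holds every column except the grouping column id, i.e. x and y.
res <- dt[, lapply(.SD, lag_diff), by = id]

# Rename by column name rather than by position.
setnames(res, c("x", "y"), c("dx", "dy"))
print(res)
```

    Renaming by name instead of 2:3 keeps the call valid if the column order changes.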
    

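    Another option would be to use dplyr (df holds the same data as a data frame):

    library(dplyr)
    df %>%
        group_by(id) %>%
        # mutate_each() and funs() are deprecated; across() replaces them
        mutate(across(c(x, y), ~ c(NA, diff(.x)))) %>%
        rename(dx = x, dy = y)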
    

    Update

    To also get the difference of the differences, repeat the step on the new columns (columns 4 and 5 are dx and dy):

    dt[, c('dx', 'dy') := lapply(.SD, function(x) c(NA, diff(x))), by = id]
    dt[, c('dx2', 'dy2') := lapply(.SD, function(x) c(NA, diff(x))),
       by = id, .SDcols = 4:5]
    dt
    #   x y id dx dy dx2 dy2
    #1: 1 2  1 NA NA  NA  NA
    #2: 2 4  1  1  2  NA  NA
    #3: 3 5  1  1  1   0  -1
    #4: 1 8  2 NA NA  NA  NA
    #5: 5 2  2  4 -6  NA  NA
    #6: 6 3  2  1  1  -3   7
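
    Differencing twice is the same as a second-order difference, which base R's diff() computes directly with differences = 2. A minimal sketch under the same assumed sample data; lag_diff2 is a hypothetical helper name, not part of data.table.

```r
library(data.table)

# Sample data inferred from the output in this answer (an assumption).
dt <- data.table(x  = c(1, 2, 3, 1, 5, 6),
                 y  = c(2, 4, 5, 8, 2, 3),
                 id = c(1, 1, 1, 2, 2, 2))

# Hypothetical helper: second-order difference padded with two leading NAs.
# head() trims the padding so groups with fewer than three rows still get
# a result of the same length as the group.
lag_diff2 <- function(v) head(c(NA, NA, diff(v, differences = 2)), length(v))

# Computed straight from x and y, so dx and dy are not needed first.
dt[, c("dx2", "dy2") := lapply(.SD, lag_diff2), by = id, .SDcols = c("x", "y")]
print(dt)
```

    This skips the intermediate dx and dy columns; keep the two-step version if those columns are wanted as well.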
    

    Or, starting again from the original dt (so that .SD contains only x and y), we can use the shift function from data.table:

    dt[, paste0("d", c("x", "y")) := .SD - shift(.SD), by = id
      ][, paste0("d", c("x2", "y2")) := .SD - shift(.SD), by = id, .SDcols = 4:5]
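
    The shift approach can also be written column by column, selecting columns by name. This is a sketch under the same assumed sample data; it relies only on shift()'s defaults (lag by one row, fill with NA).

```r
library(data.table)

# Sample data inferred from the output in this answer (an assumption).
dt <- data.table(x  = c(1, 2, 3, 1, 5, 6),
                 y  = c(2, 4, 5, 8, 2, 3),
                 id = c(1, 1, 1, 2, 2, 2))

cols <- c("x", "y")

# shift(v) returns v lagged by one row within the group, with NA in the
# first position, so v - shift(v) is the first difference.
dt[, paste0("d", cols) := lapply(.SD, function(v) v - shift(v)),
   by = id, .SDcols = cols]

# Second pass on the new columns, selected by name instead of 4:5.
dt[, c("dx2", "dy2") := lapply(.SD, function(v) v - shift(v)),
   by = id, .SDcols = c("dx", "dy")]
print(dt)
```

    Naming the columns in .SDcols means the second step does not depend on dx and dy sitting in positions 4 and 5.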
    