Creating a bunch of lagged variables in data.table at once

佐手、 submitted on 2019-12-21 06:55:40

Question


I am trying to create several lagged variables at once in data.table. The lags should be computed within each combination of station and landcover, and I am having some difficulty. Here is my example data.table.

library(data.table)
r <- structure(list(station = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
2L, 2L, 2L, 2L, 2L, 2L), .Label = c("A", "B"), class = "factor"), 
    landcover = structure(c(2L, 2L, 2L, 4L, 4L, 4L, 1L, 1L, 1L, 
    3L, 3L, 3L), .Label = c("foam", "Mixed Forest", "other2", 
    "Sand"), class = "factor"), cv = c(0.273287412020818, 0.453346217936644, 
    0.235088531585817, 0.703112865400233, 0.221907230708271, 
    0.278459655651048, 0.376646346809308, 0.662970017835398, 
    0.296458678818467, 0.390335320625924, 0.712476246695341, 
    0.535612484651002)), .Names = c("station", "landcover", "cv"
), row.names = c(NA, -12L), class = c("data.table", "data.frame"
))
# A table rebuilt with structure() lacks data.table's internal self-reference;
# setDT() re-registers it before := is used to add columns by reference.
setDT(r)

# station    landcover        cv
# 1:       A Mixed Forest 0.2732874
# 2:       A Mixed Forest 0.4533462
# 3:       A Mixed Forest 0.2350885
# 4:       A         Sand 0.7031129
# 5:       A         Sand 0.2219072
# 6:       A         Sand 0.2784597
# 7:       B         foam 0.3766463
# 8:       B         foam 0.6629700
# 9:       B         foam 0.2964587
# 10:       B       other2 0.3903353
# 11:       B       other2 0.7124762
# 12:       B       other2 0.5356125

I want to create a bunch of lagged variables, and at this point I am not concerned about the NA values that will result. How do I create a data.table like the one below without writing so much code? The result needs to remain a data.table.

r[, cv.lag1 :=  c(rep(NA,1), head(cv, -1)),by=c("station","landcover")]
r[, cv.lag2 :=  c(rep(NA,2), head(cv, -2)),by=c("station","landcover")]
r[, cv.lag3 :=  c(rep(NA,3), head(cv, -3)),by=c("station","landcover")]
r[, cv.lag4 :=  c(rep(NA,4), head(cv, -4)),by=c("station","landcover")]
r[, cv.lag5 :=  c(rep(NA,5), head(cv, -5)),by=c("station","landcover")]
r[, cv.lag6 :=  c(rep(NA,6), head(cv, -6)),by=c("station","landcover")]
r[, cv.lag7 :=  c(rep(NA,7), head(cv, -7)),by=c("station","landcover")]
r[, cv.lag8 :=  c(rep(NA,8), head(cv, -8)),by=c("station","landcover")]
r[, cv.lag9 :=  c(rep(NA,9), head(cv, -9)),by=c("station","landcover")]
r[, cv.lag10 := c(rep(NA,10), head(cv, -10)),by=c("station","landcover")]

    station    landcover        cv   cv.lag1   cv.lag2 cv.lag3 cv.lag4 cv.lag5 cv.lag6 cv.lag7 cv.lag8 cv.lag9 cv.lag10
 1:       A Mixed Forest 0.2732874        NA        NA      NA      NA      NA      NA      NA      NA      NA       NA
 2:       A Mixed Forest 0.4533462 0.2732874        NA      NA      NA      NA      NA      NA      NA      NA       NA
 3:       A Mixed Forest 0.2350885 0.4533462 0.2732874      NA      NA      NA      NA      NA      NA      NA       NA
 4:       A         Sand 0.7031129        NA        NA      NA      NA      NA      NA      NA      NA      NA       NA
 5:       A         Sand 0.2219072 0.7031129        NA      NA      NA      NA      NA      NA      NA      NA       NA
 6:       A         Sand 0.2784597 0.2219072 0.7031129      NA      NA      NA      NA      NA      NA      NA       NA
 7:       B         foam 0.3766463        NA        NA      NA      NA      NA      NA      NA      NA      NA       NA
 8:       B         foam 0.6629700 0.3766463        NA      NA      NA      NA      NA      NA      NA      NA       NA
 9:       B         foam 0.2964587 0.6629700 0.3766463      NA      NA      NA      NA      NA      NA      NA       NA
10:       B       other2 0.3903353        NA        NA      NA      NA      NA      NA      NA      NA      NA       NA
11:       B       other2 0.7124762 0.3903353        NA      NA      NA      NA      NA      NA      NA      NA       NA
12:       B       other2 0.5356125 0.7124762 0.3903353      NA      NA      NA      NA      NA      NA      NA       NA

Answer 1:


Thanks to Arun for providing an elegant one-line solution.

r[, c(paste("cv.lag", 1:10, sep="")) := lapply(1:10, function(i) c(rep(NA, i), head(cv, -i))), by=list(station,landcover)]

    station    landcover        cv   cv.lag1   cv.lag2 cv.lag3 cv.lag4 cv.lag5 cv.lag6 cv.lag7 cv.lag8 cv.lag9 cv.lag10
 1:       A Mixed Forest 0.2732874        NA        NA      NA      NA      NA      NA      NA      NA      NA       NA
 2:       A Mixed Forest 0.4533462 0.2732874        NA      NA      NA      NA      NA      NA      NA      NA       NA
 3:       A Mixed Forest 0.2350885 0.4533462 0.2732874      NA      NA      NA      NA      NA      NA      NA       NA
 4:       A         Sand 0.7031129        NA        NA      NA      NA      NA      NA      NA      NA      NA       NA
 5:       A         Sand 0.2219072 0.7031129        NA      NA      NA      NA      NA      NA      NA      NA       NA
 6:       A         Sand 0.2784597 0.2219072 0.7031129      NA      NA      NA      NA      NA      NA      NA       NA
 7:       B         foam 0.3766463        NA        NA      NA      NA      NA      NA      NA      NA      NA       NA
 8:       B         foam 0.6629700 0.3766463        NA      NA      NA      NA      NA      NA      NA      NA       NA
 9:       B         foam 0.2964587 0.6629700 0.3766463      NA      NA      NA      NA      NA      NA      NA       NA
10:       B       other2 0.3903353        NA        NA      NA      NA      NA      NA      NA      NA      NA       NA
11:       B       other2 0.7124762 0.3903353        NA      NA      NA      NA      NA      NA      NA      NA       NA
12:       B       other2 0.5356125 0.7124762 0.3903353      NA      NA      NA      NA      NA      NA      NA       NA
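
As a possible alternative to the hand-rolled `c(rep(NA, i), head(cv, -i))`, here is a minimal sketch using data.table's own `shift()` function. It assumes a data.table version that provides `shift()`, and it is not part of the original answer. The example table is a simplified stand-in for the question's data (rounded values, character instead of factor columns), and the `lags` variable is introduced only for illustration. One reason to consider it: the hand-rolled expression produces a vector of length `i` whenever the lag `i` exceeds the group size (for example 4 values for a group of 3 rows), which some data.table versions may reject or warn about, whereas `shift()` always returns vectors exactly as long as the group.

```r
library(data.table)

# Simplified stand-in for the question's table (rounded cv values).
r <- data.table(
  station   = rep(c("A", "B"), each = 6),
  landcover = rep(c("Mixed Forest", "Sand", "foam", "other2"), each = 3),
  cv        = c(0.27, 0.45, 0.24, 0.70, 0.22, 0.28,
                0.38, 0.66, 0.30, 0.39, 0.71, 0.54)
)

# Hypothetical helper variable: the lags to generate.
lags <- 1:10

# shift() with a vector of lags returns a list with one element per lag.
# Each element is padded with NA and has the same length as the group,
# so the list maps directly onto the new column names on the left of :=.
r[, paste0("cv.lag", lags) := shift(cv, n = lags, type = "lag"),
  by = .(station, landcover)]
```

Changing `lags` (for example to `c(1, 3, 6)`) changes both the column names and the lags computed, since the same vector drives both sides of `:=`.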


Source: https://stackoverflow.com/questions/23420307/creating-a-bunch-of-lagged-variables-in-data-table-at-once
