Question
I am trying to create a set of lagged variables all at once in data.table. The lags should be computed within each station and landcover group, and I am having some difficulty. Here is my example data.table:
require(data.table)
r <- structure(list(station = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L), .Label = c("A", "B"), class = "factor"),
landcover = structure(c(2L, 2L, 2L, 4L, 4L, 4L, 1L, 1L, 1L,
3L, 3L, 3L), .Label = c("foam", "Mixed Forest", "other2",
"Sand"), class = "factor"), cv = c(0.273287412020818, 0.453346217936644,
0.235088531585817, 0.703112865400233, 0.221907230708271,
0.278459655651048, 0.376646346809308, 0.662970017835398,
0.296458678818467, 0.390335320625924, 0.712476246695341,
0.535612484651002)), .Names = c("station", "landcover", "cv"
), row.names = c(NA, -12L), class = c("data.table", "data.frame"
))
setDT(r)  # re-initialise the data.table internals that structure() does not set up
# station landcover cv
# 1: A Mixed Forest 0.2732874
# 2: A Mixed Forest 0.4533462
# 3: A Mixed Forest 0.2350885
# 4: A Sand 0.7031129
# 5: A Sand 0.2219072
# 6: A Sand 0.2784597
# 7: B foam 0.3766463
# 8: B foam 0.6629700
# 9: B foam 0.2964587
# 10: B other2 0.3903353
# 11: B other2 0.7124762
# 12: B other2 0.5356125
I want to create several lagged variables. I am not concerned at this point about the NA values that will result. How do I create a data.table that looks like the one below without writing so much code? The result needs to remain a data.table.
# Pad cv with k leading NAs, then keep the first .N values so each lag matches the group size
r[, cv.lag1 := head(c(rep(NA,1), cv), .N),by=c("station","landcover")]
r[, cv.lag2 := head(c(rep(NA,2), cv), .N),by=c("station","landcover")]
r[, cv.lag3 := head(c(rep(NA,3), cv), .N),by=c("station","landcover")]
r[, cv.lag4 := head(c(rep(NA,4), cv), .N),by=c("station","landcover")]
r[, cv.lag5 := head(c(rep(NA,5), cv), .N),by=c("station","landcover")]
r[, cv.lag6 := head(c(rep(NA,6), cv), .N),by=c("station","landcover")]
r[, cv.lag7 := head(c(rep(NA,7), cv), .N),by=c("station","landcover")]
r[, cv.lag8 := head(c(rep(NA,8), cv), .N),by=c("station","landcover")]
r[, cv.lag9 := head(c(rep(NA,9), cv), .N),by=c("station","landcover")]
r[, cv.lag10 := head(c(rep(NA,10), cv), .N),by=c("station","landcover")]
station landcover cv cv.lag1 cv.lag2 cv.lag3 cv.lag4 cv.lag5 cv.lag6 cv.lag7 cv.lag8 cv.lag9 cv.lag10
1: A Mixed Forest 0.2732874 NA NA NA NA NA NA NA NA NA NA
2: A Mixed Forest 0.4533462 0.2732874 NA NA NA NA NA NA NA NA NA
3: A Mixed Forest 0.2350885 0.4533462 0.2732874 NA NA NA NA NA NA NA NA
4: A Sand 0.7031129 NA NA NA NA NA NA NA NA NA NA
5: A Sand 0.2219072 0.7031129 NA NA NA NA NA NA NA NA NA
6: A Sand 0.2784597 0.2219072 0.7031129 NA NA NA NA NA NA NA NA
7: B foam 0.3766463 NA NA NA NA NA NA NA NA NA NA
8: B foam 0.6629700 0.3766463 NA NA NA NA NA NA NA NA NA
9: B foam 0.2964587 0.6629700 0.3766463 NA NA NA NA NA NA NA NA
10: B other2 0.3903353 NA NA NA NA NA NA NA NA NA NA
11: B other2 0.7124762 0.3903353 NA NA NA NA NA NA NA NA NA
12: B other2 0.5356125 0.7124762 0.3903353 NA NA NA NA NA NA NA NA
Answer 1:
Thanks to Arun for providing this elegant one-line solution.
# For each lag i, pad cv with i leading NAs and keep the first length(cv) values per group
r[, c(paste("cv.lag", 1:10, sep="")) := lapply(1:10, function(i) head(c(rep(NA, i), cv), length(cv))), by=list(station,landcover)]
station landcover cv cv.lag1 cv.lag2 cv.lag3 cv.lag4 cv.lag5 cv.lag6 cv.lag7 cv.lag8 cv.lag9 cv.lag10
1: A Mixed Forest 0.2732874 NA NA NA NA NA NA NA NA NA NA
2: A Mixed Forest 0.4533462 0.2732874 NA NA NA NA NA NA NA NA NA
3: A Mixed Forest 0.2350885 0.4533462 0.2732874 NA NA NA NA NA NA NA NA
4: A Sand 0.7031129 NA NA NA NA NA NA NA NA NA NA
5: A Sand 0.2219072 0.7031129 NA NA NA NA NA NA NA NA NA
6: A Sand 0.2784597 0.2219072 0.7031129 NA NA NA NA NA NA NA NA
7: B foam 0.3766463 NA NA NA NA NA NA NA NA NA NA
8: B foam 0.6629700 0.3766463 NA NA NA NA NA NA NA NA NA
9: B foam 0.2964587 0.6629700 0.3766463 NA NA NA NA NA NA NA NA
10: B other2 0.3903353 NA NA NA NA NA NA NA NA NA NA
11: B other2 0.7124762 0.3903353 NA NA NA NA NA NA NA NA NA
12: B other2 0.5356125 0.7124762 0.3903353 NA NA NA NA NA NA NA NA
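As an alternative to the approach above, newer data.table releases (1.9.6 and later, as far as I know) ship a `shift()` function that accepts a vector of lags, so the padding no longer has to be written by hand. The following is only a sketch: `dt`, `lags` and `lag_cols` are names made up for illustration, and the `cv` values are rounded from the question's data.

```r
library(data.table)

# Stand-in for the question's table (cv values rounded for readability)
dt <- data.table(
  station   = rep(c("A", "B"), each = 6),
  landcover = rep(c("Mixed Forest", "Sand", "foam", "other2"), each = 3),
  cv        = c(0.27, 0.45, 0.24, 0.70, 0.22, 0.28,
                0.38, 0.66, 0.30, 0.39, 0.71, 0.54)
)

lags     <- 1:10                      # lags to compute (illustrative choice)
lag_cols <- paste0("cv.lag", lags)    # cv.lag1 ... cv.lag10

# With a vector n, shift() returns a list holding one lagged vector per lag,
# each NA-padded and the same length as the group, so it can be assigned
# to all ten columns at once.
dt[, (lag_cols) := shift(cv, n = lags, type = "lag"), by = .(station, landcover)]
```

Because `shift()` always returns vectors as long as its input, lags larger than the group size simply come back as all `NA`.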
Source: https://stackoverflow.com/questions/23420307/creating-a-bunch-of-lagged-variables-in-data-table-at-once