Question
I am having trouble doing something fairly simple: applying a rolling function (standard deviation) by group in a data.table. When I use rollapply grouped by some column, data.table recycles the results, as noted in the warning messages below. I would like to get NA for the observations that fall outside the window instead of recycled standard deviations.
This is my approach so far using iris, and a rolling window of size 2, aligned to the right:
library(zoo)
library(data.table)
A <- iris
setDT(A)
A[, stdev := rollapply(Petal.Width, width = 2, FUN = sd, align = 'right', partial = FALSE), by = Species]
Warning messages:
1: In `[.data.table`(A, , `:=`(stdeev, rollapply(Petal.Width, width = 2, :
Supplied 49 items to be assigned to group 1 of size 50 in column 'stdeev' (recycled leaving remainder of 1 items).
2: In `[.data.table`(A, , `:=`(stdeev, rollapply(Petal.Width, width = 2, :
Supplied 49 items to be assigned to group 2 of size 50 in column 'stdeev' (recycled leaving remainder of 1 items).
3: In `[.data.table`(A, , `:=`(stdeev, rollapply(Petal.Width, width = 2, :
Supplied 49 items to be assigned to group 3 of size 50 in column 'stdeev' (recycled leaving remainder of 1 items).
> A
     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species     stdeev      stdev
  1:          5.1         3.5          1.4         0.2    setosa 0.00000000 0.00000000
  2:          4.9         3.0          1.4         0.2    setosa 0.00000000 0.00000000
  3:          4.7         3.2          1.3         0.2    setosa 0.00000000 0.00000000
  4:          4.6         3.1          1.5         0.2    setosa 0.00000000 0.00000000
  5:          5.0         3.6          1.4         0.2    setosa 0.14142136 0.14142136
 ---
146:          6.7         3.0          5.2         2.3 virginica 0.28284271 0.28284271
147:          6.3         2.5          5.0         1.9 virginica 0.07071068 0.07071068
148:          6.5         3.0          5.2         2.0 virginica 0.21213203 0.21213203
149:          6.2         3.4          5.4         2.3 virginica 0.35355339 0.35355339
150:          5.9         3.0          5.1         1.8 virginica 0.42426407 0.42426407
Answer 1:
Add fill = NA to rollapply. This will ensure that a vector of length 50 (rather than 49) is returned for each group, with NA as the first value (since align = "right"), avoiding recycling.
A[, stdev := rollapply(Petal.Width, width = 2, FUN = sd, align = 'right', partial = FALSE, fill = NA), by = Species]
    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species      stdev
1            5.1         3.5          1.4         0.2     setosa         NA
2            4.9         3.0          1.4         0.2     setosa 0.00000000
3            4.7         3.2          1.3         0.2     setosa 0.00000000
...
51           7.0         3.2          4.7         1.4 versicolor         NA
52           6.4         3.2          4.5         1.5 versicolor 0.07071068
53           6.9         3.1          4.9         1.5 versicolor 0.00000000
...
101          6.3         3.3          6.0         2.5  virginica         NA
102          5.8         2.7          5.1         1.9  virginica 0.42426407
103          7.1         3.0          5.9         2.1  virginica 0.14142136
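The length difference described above can be checked directly on a single group. The following is a minimal sketch, not part of the original answer: the variable names (x, no_fill, with_fill, stdev_froll) are made up for illustration, and the last step assumes a data.table version that provides frollapply (1.12.4 or later), which pads incomplete windows with NA by default.

```r
library(zoo)
library(data.table)

# One group of 50 values (the setosa rows of iris)
x <- iris$Petal.Width[iris$Species == "setosa"]

# Without fill, only the complete width-2 windows are returned
no_fill <- rollapply(x, width = 2, FUN = sd, align = "right")
length(no_fill)    # 49

# With fill = NA, the result is padded back to the input length
with_fill <- rollapply(x, width = 2, FUN = sd, align = "right", fill = NA)
length(with_fill)  # 50
is.na(with_fill[1])

# Grouped assignment: every group now receives exactly 50 values
A <- as.data.table(iris)
A[, stdev := rollapply(Petal.Width, width = 2, FUN = sd,
                       align = "right", fill = NA), by = Species]

# Count the NA values per species; one leading NA is expected in each group
A[, .(n_na = sum(is.na(stdev))), by = Species]

# Possible alternative (assumes data.table >= 1.12.4): frollapply is
# right-aligned and NA-filled by default, so no zoo call is needed
A[, stdev_froll := frollapply(Petal.Width, 2, sd), by = Species]
all.equal(A$stdev, A$stdev_froll)
```

Because each group gets a vector of the same length as the group itself, the := assignment no longer needs to recycle and the warnings should disappear.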
Source: https://stackoverflow.com/questions/43107071/apply-a-rolling-function-by-group-in-r-zoo-data-table