Question
I have a data set with several grouping variables on which I want to run a rolling window linear regression. The ultimate goal is to extract the 10 linear regressions with the lowest slopes and average them together to provide a mean minimum rate of change. I have found examples using rollapply to calculate rolling window linear regressions, but I have the added complication that I would like to apply these linear regressions to groups within the data set.
Here is a sample data set and my current code, which is close but isn't quite working.
dat<-data.frame(w=c(rep(1,27), rep(2,27),rep(3,27)), z=c(rep(c(1,2,3),27)),
x=c(rep(seq(1,27),3)), y=c(rnorm(27,10,3), rnorm(27,3,2.2), rnorm(27, 6,1.3)))
where w and z are two grouping variables and x and y are the regression terms.
From my internet searches, here is a basic R rolling window linear regression where the window size is 6, successive regressions are separated by 3 data points, and I extract only the slope, coef(lm...)[2]:
library(zoo)
slopeData<-rollapply(zoo(dat), width=6, function(Z) {
  coef(lm(formula=y~x, data = as.data.frame(Z), na.rm=T))[2]
}, by = 3, by.column=FALSE, align="right")
Now I wish to apply this rolling window regression to the groups specified by the two grouping variables w and z. So I tried something like this using ddply from the plyr package. First, I tried to rewrite the code above as a function.
rolled<-function(df) {
  rollapply(zoo(df), width=6, function(Z) {
    coef(lm(formula=y~x, data = as.data.frame(Z), na.rm=T))[2]
  }, by = 3, by.column=FALSE, align="right")
}
And then apply that function using ddply:
groupedSlope <- ddply(dat, .(w,z), function(d) rolled(d))
This, however, doesn't work, as I get a series of warnings and errors. I imagine that some of the errors relate to combining zoo objects with data frames, which makes this overly complicated. It's what I have been working on so far, but does anyone know of a way to get grouped, rolling window linear regressions, potentially simpler than this method?
Thanks for any assistance, Nate
Answer 1:
1) rollapply works on data frames too, so it is not necessary to convert df to zoo.
2) lm uses na.action, not na.rm, and its default is na.omit, so we can just drop this argument.
3) rollapplyr is a more concise way to write rollapply(..., align = "right").
Assuming that rolled otherwise does what you want, and incorporating these changes into rolled, the ddply statement in the question should work, or we could use by from base R, which we show below:
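library(zoo)

# Slope of y ~ x over right-aligned windows of 6 rows, advancing 3 rows at a time
rolled <- function(df) {
  rollapplyr(df, width = 6, function(m) {
    coef(lm(formula = y ~ x, data = as.data.frame(m)))[2]
  }, by = 3, by.column = FALSE)
}

# Apply rolled to each (w, z) group and stack the per-group slopes
do.call("rbind", by(dat, dat[c("w", "z")], rolled))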
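To get from the grouped slopes to the "mean minimum rate of change" described in the question, one possible follow-up is sketched below. It is not part of the original answer: the names slopes, lowest10 and mean_min_rate are made up for illustration, the set.seed call is an assumption added only to make the run reproducible, and the sample data is rebuilt so the block runs on its own. It also assumes that the 10 lowest slopes should be pooled across all groups rather than taken within each group.

```r
library(zoo)

set.seed(1)  # assumption: fixed seed, only for reproducibility of this sketch

# Sample data equivalent to the question's: w and z are grouping variables
dat <- data.frame(w = rep(1:3, each = 27), z = rep(1:3, 27),
                  x = rep(1:27, 3),
                  y = c(rnorm(27, 10, 3), rnorm(27, 3, 2.2), rnorm(27, 6, 1.3)))

# Rolling regression from the answer: window of 6, every 3rd window, slope only
rolled <- function(df) {
  rollapplyr(df, width = 6, function(m) {
    coef(lm(y ~ x, data = as.data.frame(m)))[2]
  }, by = 3, by.column = FALSE)
}

# Stack the per-group slopes, then flatten them into one numeric vector
slopes <- as.numeric(do.call("rbind", by(dat, dat[c("w", "z")], rolled)))

# Hypothetical final step: average the 10 lowest slopes across all groups
lowest10 <- head(sort(slopes), 10)
mean_min_rate <- mean(lowest10)
mean_min_rate
```

With this sample data each (w, z) group has 9 rows, so a window of 6 advanced by 3 yields 2 slopes per group and 18 in total. If the 10 lowest slopes are wanted per group instead, the sort and mean steps would have to move inside the grouped call, and each group would need at least 10 windows.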
Source: https://stackoverflow.com/questions/28306604/r-grouped-rolling-window-linear-regression-with-rollapply-and-ddply