R: rollapplyr and lm factor error: Does rollapplyr change variable class?

Posted by 狂风中的少年 on 2019-12-08 03:40:07

Question


This question builds on a previous one, which was nicely answered for me here:

R: Grouped rolling window linear regression with rollapply and ddply

Wouldn't you know that the code doesn't quite work when extended to the real data rather than the example data?

I have a somewhat large dataset with the following characteristics.

str(T0_satData_reduced)
'data.frame':   45537 obs. of  5 variables:
 $ date   : POSIXct, format: "2014-11-17 08:47:35" "2014-11-17 08:47:36" "2014-11-17 08:47:37" ...
 $ trial  : Factor w/ 5 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ vial   : Factor w/ 4 levels "1","2","3","4": 1 1 1 1 1 1 1 1 1 1 ...
 $ O2sat  : num  95.1 95.1 95.1 95.1 95 95.1 95.1 95.2 95.1 95 ...
 $ elapsed: num  20 20 20.1 20.1 20.1 ...

The previous question dealt with the desire to apply a rolling regression of O2sat as a function of elapsed, but grouping the regressions by the factors trial and vial.

The following code is drawn from the answer to my previous question (modified only to use the complete dataset rather than the practice one):

rolled <- function(df) {
  rollapplyr(df, width = 600, function(m) {
    coef(lm(formula = O2sat ~ elapsed, data = as.data.frame(m)))
  }, by = 60, by.column = FALSE)
}

T0_slopes <- ddply(T0_satData_reduced, .(trial,vial), function(d) rolled(d))

However, when I run this code I get a series of warnings (the first two are shown here):

Warning messages:
1: In model.response(mf, "numeric") :
using type = "numeric" with a factor response will be ignored
2: In Ops.factor(y, z$residuals) : - not meaningful for factors

I'm not sure where these warnings come from, since str() shows that both elapsed and O2sat are numeric, so I am not regressing on factors. However, the warnings go away if I force both variables to be numeric within the rolled function above, like this:

...
coef(lm(formula = as.numeric(O2sat) ~ as.numeric(elapsed), data = as.data.frame(m)))
...

I no longer get the warnings, but I don't know why this would fix them. Additionally, the resulting regressions appear suspect because the intercept terms seem inappropriately small.

Any thoughts on why I am getting these errors and why using as.numeric seems to eliminate the errors (if potentially still providing inappropriate regression terms)?

Thank you


Answer 1:


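rollapply passes a matrix to the function, and a matrix can hold only one type, so the date and factor columns force every column, including O2sat and elapsed, to character. The fix is to pass only the numeric columns. Using rolled from my prior answer and the setup in that question: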

do.call("rbind", by(dat[c("x", "y")], dat[c("w", "z")], rolled))
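The coercion behind the warnings can be reproduced in base R, without zoo. The sketch below uses a made-up four-row data frame (not the asker's data) and sets stringsAsFactors = TRUE explicitly to mimic the default that as.data.frame() had before R 4.0.0, which is what the question would have been run under. It also suggests why the as.numeric() workaround gives suspect coefficients: on a factor it returns level codes rather than the measured values.

```r
# Hypothetical toy data, standing in for a few rows of the real dataset
df <- data.frame(
  trial   = factor(c("1", "1", "2", "2")),
  O2sat   = c(95.1, 95.0, 94.8, 94.9),
  elapsed = c(20.0, 20.1, 20.2, 20.3)
)

# A matrix holds a single type, so the factor column forces everything to character
m <- as.matrix(df)
is.character(m)

# Converting back turns the strings into factors (the pre-4.0.0 default behaviour)
back <- as.data.frame(m, stringsAsFactors = TRUE)
is.factor(back$O2sat)

# as.numeric() on a factor yields the integer level codes, not the readings
codes <- as.numeric(back$O2sat)

# Going through character first recovers the original values
values <- as.numeric(as.character(back$O2sat))
```

Regressing on codes like these would explain intercepts that look far too small for saturation readings near 95.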

Added

Another way to do it is to perform the rollapply over the row indexes instead of over the data frame itself. In this example we have also added the conditioning variables as extra output columns:

rolli <- function(ix) {
   data.frame(coef = rollapplyr(ix, width = 6, function(ix) { 
         coef(lm(y ~ x, data = dat, subset = ix))[2]
      }, by = 3), w = dat$w[ix][1], z = dat$z[ix][1])
}
do.call("rbind", by(1:nrow(dat), dat[c("w", "z")], rolli))
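As a rough illustration, the row-index idea might look like this with the column names from the question. Everything here is an assumption for demonstration purposes: the data are simulated, the helper name roll_group is hypothetical, the window settings are shrunk to fit the toy data, and split() with lapply() is swapped in for by() to do the grouping.

```r
library(zoo)  # provides rollapplyr()

# Simulated stand-in for T0_satData_reduced: 2 trials x 2 vials, 30 rows each
set.seed(1)
dat <- expand.grid(i = 1:30, vial = factor(1:2), trial = factor(1:2))
dat$elapsed <- dat$i / 10
dat$O2sat   <- 95 - 0.5 * dat$elapsed + rnorm(nrow(dat), sd = 0.01)

# Roll over row indexes so lm() always sees the original numeric columns
roll_group <- function(ix, width = 10, by = 5) {
  slope <- rollapplyr(ix, width = width, by = by, FUN = function(i) {
    fit <- lm(O2sat ~ elapsed, data = dat[as.integer(i), ])
    coef(fit)[["elapsed"]]  # keep only the slope, as in the answer
  })
  data.frame(trial = dat$trial[ix[1]], vial = dat$vial[ix[1]], slope = slope)
}

# One vector of row numbers per trial/vial combination
groups <- split(seq_len(nrow(dat)), list(dat$trial, dat$vial), drop = TRUE)
slopes <- do.call(rbind, lapply(groups, roll_group))
```

For the real data the same pattern would presumably use width = 600 and by = 60, with dat replaced by T0_satData_reduced.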


Source: https://stackoverflow.com/questions/28332566/r-rollapplyr-and-lm-factor-error-does-rollapplyr-change-variable-class
