Recursive regression in R

清酒与你 2021-02-10 05:41

Say I have a data frame in R as follows:

    set.seed(1)
    X <- runif(50, 0, 1)
    Y <- runif(50, 0, 1)
    df <- data.frame(X, Y)
    head(df)

         


        
3 Answers
  •  清歌不尽
    2021-02-10 06:09

    You may be interested in the biglm function in the biglm package. It lets you fit a regression on a subset of the data and then update the fitted model with additional data. The original idea was to handle large datasets, so that only part of the data has to be in memory at any given time, but it fits the description of what you want to do perfectly: wrap the updating step in a loop. The summary for biglm objects gives confidence intervals in addition to coefficients and standard errors.

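    library(biglm)

    # Initial fit on the first 20 rows
    fit1 <- biglm(Sepal.Width ~ Sepal.Length + Species, data = iris[1:20, ])
    summary(fit1)

    # out[[k]] holds the model fitted to rows 1..(19 + k)
    out <- list()
    out[[1]] <- fit1

    # Add the remaining 130 rows one at a time
    for (i in 1:130) {
      out[[i + 1]] <- update(out[[i]], iris[i + 20, ])
    }

    # Row 2 of the summary matrix is Sepal.Length; columns 2:3 should be
    # the lower and upper confidence bounds
    out2 <- lapply(out, function(x) summary(x)$mat)
    out3 <- sapply(out2, function(x) x[2, 2:3])
    matplot(t(out3), type = 'l')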
    

    If you don't want to use an explicit loop, then the Reduce function can help:

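    fit1 <- biglm(Sepal.Width ~ Sepal.Length + Species, data = iris[1:20, ])
    # Rows 1-20 get NA and are dropped by split(); each remaining row
    # becomes its own one-row data frame
    iris.split <- split(iris, c(rep(NA, 20), 1:130))
    # accumulate = TRUE keeps every intermediate model, starting with fit1
    out4 <- Reduce(update, iris.split, init = fit1, accumulate = TRUE)
    out5 <- lapply(out4, function(x) summary(x)$mat)
    out6 <- sapply(out5, function(x) x[2, 2:3])

    # Compare with the result of the explicit loop
    all.equal(out3, out6)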
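
For the data frame in the question, here is a minimal sketch of the same expanding-window idea. It refits with base R's lm() at every step instead of updating a biglm object, so it needs no extra package but does redo the whole fit each time. The initial window of 20 rows (`n0`) and the choice to track the slope on X are assumptions; the question as shown does not state them.

```r
# Data from the question (set.seed(1) makes it reproducible)
set.seed(1)
X <- runif(50, 0, 1)
Y <- runif(50, 0, 1)
df <- data.frame(X, Y)

n0 <- 20  # assumed size of the initial window; not given in the question

# Refit Y ~ X on rows 1..n for n = n0, ..., nrow(df) and keep the slope
# together with its 95% confidence interval from confint()
recursive <- t(sapply(n0:nrow(df), function(n) {
  fit <- lm(Y ~ X, data = df[seq_len(n), ])
  c(n = n, slope = unname(coef(fit)["X"]), confint(fit)["X", ])
}))

head(recursive)

# Slope and interval bounds against the number of rows used
matplot(recursive[, "n"], recursive[, -1], type = "l",
        xlab = "rows used", ylab = "slope and 95% CI")
```

Each row of `recursive` corresponds to one window size, so it can be compared directly with what the biglm loop produces on the same data and formula.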
    
