k-fold cross validation - how to get the prediction automatically?

情话喂你 2021-02-03 15:53

This may be a silly question, but I just can't find a package to do that... I know I can write some code to get what I want, but it would be nice to have a function that does it automatically.

2 Answers
  • 2021-02-03 16:29

    As stated in the comments, caret makes cross-validation very easy. Just use the "glm" method and ask `trainControl` to save the held-out predictions, like so:

    > library(caret)
    > set.seed(2)
    > dat <- data.frame(label=round(runif(100,0,5)),v1=rnorm(100),v2=rnorm(100))
    > tc <- trainControl("cv",10,savePred=T)
    > (fit <- train(label~.,data=dat,method="glm",trControl=tc,family=poisson(link = "log")))
    100 samples
      2 predictors
    
    No pre-processing
    Resampling: Cross-Validation (10 fold) 
    
    Summary of sample sizes: 90, 91, 91, 90, 90, 89, ... 
    
    Resampling results
    
      RMSE  Rsquared  RMSE SD  Rsquared SD
      1.53  0.146     0.131    0.235      
    
    
    > fit$finalModel$family
    
    Family: poisson 
    Link function: log 
    
    > head(fit$pred)
          pred obs rowIndex .parameter Resample
    1 2.684367   1        1       none   Fold01
    2 2.165246   1       18       none   Fold01
    3 2.716165   3       35       none   Fold01
    4 2.514789   3       36       none   Fold01
    5 2.249137   5       47       none   Fold01
    6 2.328514   2       48       none   Fold01
    
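    To get the held-out predictions back in the original row order, note that `fit$pred` is grouped fold by fold, so you can sort it by `rowIndex`. A minimal sketch, repeating the setup above (`savePredictions = TRUE` is the long form of `savePred=T`):

    ```r
    library(caret)

    set.seed(2)
    dat <- data.frame(label = round(runif(100, 0, 5)),
                      v1 = rnorm(100), v2 = rnorm(100))
    tc <- trainControl("cv", 10, savePredictions = TRUE)
    fit <- train(label ~ ., data = dat, method = "glm",
                 trControl = tc, family = poisson(link = "log"))

    # fit$pred is grouped by fold; reorder by rowIndex so that
    # cv_pred[i] is the out-of-fold prediction for dat[i, ]
    cv_pred <- fit$pred$pred[order(fit$pred$rowIndex)]
    length(cv_pred)  # one held-out prediction per observation
    ```

    Each observation appears exactly once across the 10 validation folds, so `cv_pred` has one cross-validated prediction per row of `dat`.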
  • 2021-02-03 16:38

    I would suggest investigating cv.glm from package boot, because you are working with a glm model. Another option would be package cvTools. I've found it more useful to write my own function for CV, though. It sounds like you want a CV routine that stops halfway: most CV functions I've seen average the prediction error over all the validation sets and return just that average (which, of course, is the definition of cross-validation), rather than the individual held-out predictions.
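    Writing your own is only a few lines in base R. A minimal sketch of a hand-rolled k-fold CV that "stops halfway" and returns the out-of-fold prediction for every observation (the data frame and Poisson glm mirror the other answer's example; `cv_predict` is a made-up name):

    ```r
    # Hand-rolled k-fold CV returning one held-out prediction per row.
    cv_predict <- function(df, k = 10, seed = 2) {
      set.seed(seed)
      # Randomly assign each row to one of k roughly equal folds
      folds <- sample(rep(1:k, length.out = nrow(df)))
      preds <- numeric(nrow(df))
      for (i in 1:k) {
        test_idx <- which(folds == i)
        # Fit on the other k-1 folds, predict the held-out fold
        fit <- glm(label ~ ., data = df[-test_idx, ],
                   family = poisson(link = "log"))
        preds[test_idx] <- predict(fit, newdata = df[test_idx, ],
                                   type = "response")
      }
      preds
    }

    set.seed(2)
    dat <- data.frame(label = round(runif(100, 0, 5)),
                      v1 = rnorm(100), v2 = rnorm(100))
    head(cv_predict(dat))
    ```

    From the per-observation predictions you can then compute whatever error summary you like, instead of being handed only the averaged error that cv.glm's `delta` reports.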
