How to perform random forest/cross validation in R

误落风尘 2021-01-31 05:08

I'm unable to find a way to perform cross-validation on a regression random forest model that I'm trying to build.

So I have a dataset containing 1664 explanatory

2 Answers
  • 2021-01-31 05:40

    As topchef pointed out, cross-validation isn't necessary as a guard against over-fitting. This is a nice feature of the random forest algorithm.

    It sounds like your goal is feature selection, and cross-validation is still useful for that purpose. Take a look at the rfcv() function in the randomForest package. Its documentation asks for a data frame (or matrix) of predictors and a response vector, so I'll start by creating those from your data.

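    library(randomForest)

    set.seed(42)                 # make the fold assignment reproducible
    x <- cadets
    x$RT..seconds. <- NULL       # predictors: everything except the response
    y <- cadets$RT..seconds.     # response vector

    # 10-fold cross-validation over successively smaller predictor sets
    rf.cv <- rfcv(x, y, cv.fold = 10)

    # cross-validated error against the number of predictors used
    with(rf.cv, plot(n.var, error.cv))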
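
    To go from the plot to an actual variable count, one option is to read the minimum off the rfcv() result. The sketch below is only an illustration: it uses the built-in mtcars data (with mpg as the response) as a stand-in for cadets, and ranking predictors by the importance from a single full fit is my assumption about a reasonable next step, not something rfcv() returns.

    ```r
    library(randomForest)

    set.seed(42)
    # Stand-in data (assumption): mtcars replaces the asker's cadets data frame
    x <- mtcars[, names(mtcars) != "mpg"]
    y <- mtcars$mpg

    # Cross-validated error for successively smaller predictor sets
    rf.cv <- rfcv(x, y, cv.fold = 5)

    # Number of predictors with the lowest cross-validated error
    best.n <- rf.cv$n.var[which.min(rf.cv$error.cv)]

    # Rank predictors by importance from one full fit and keep the top best.n
    rf.full <- randomForest(x, y)
    imp <- importance(rf.full)
    top.vars <- rownames(imp)[order(imp[, "IncNodePurity"], decreasing = TRUE)]
    top.vars <- top.vars[seq_len(best.n)]

    print(best.n)
    print(top.vars)
    ```

    With few rows the error curve can be noisy, so it may be worth repeating rfcv() with several seeds before settling on a number.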
    
  • 2021-01-31 05:53

    From the source:

    The out-of-bag (oob) error estimate

    In random forests, there is no need for cross-validation or a separate test set to get an unbiased estimate of the test set error. It is estimated internally, during the run...

    In particular, predict.randomForest returns the out-of-bag prediction if newdata is not given.
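
    As a rough illustration of that last point, the sketch below fits a regression forest and computes the out-of-bag mean squared error by hand from predict() called without newdata. The built-in mtcars data and the mpg response are stand-ins I picked for the example, not the asker's data.

    ```r
    library(randomForest)

    set.seed(42)
    # Stand-in data (assumption): mtcars with mpg as the numeric response
    rf <- randomForest(mpg ~ ., data = mtcars, ntree = 500)

    # No newdata supplied, so these are the out-of-bag predictions
    oob.pred <- predict(rf)

    # Out-of-bag mean squared error, computed by hand
    oob.mse <- mean((mtcars$mpg - oob.pred)^2)

    # The fitted object also tracks an error per number of trees; the last
    # entry should be close to the hand-computed value when every row was
    # out-of-bag at least once
    print(oob.mse)
    print(rf$mse[rf$ntree])
    ```

    Calling predict(rf, newdata = mtcars) instead would score the training rows with trees that saw them, which gives an optimistic error and is not an out-of-bag estimate.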
