I'm unable to find a way of performing cross-validation on a regression random forest model that I'm trying to produce.
So I have a dataset containing 1664 explanatory variables...
As topchef pointed out, cross-validation isn't necessary as a guard against over-fitting. This is a nice feature of the random forest algorithm.
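You can see this directly on a fitted model. A minimal sketch (the formula and ntree value here are placeholders for your own model): the fitted object carries its own out-of-bag error estimate.

library(randomForest)

rf <- randomForest(RT..seconds. ~ ., data = cadets, ntree = 500)
print(rf)        # the printed summary includes the OOB error estimate
tail(rf$mse, 1)  # for regression: OOB MSE after all trees are grown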
It sounds like your goal is feature selection; cross-validation is still useful for that purpose. Take a look at the rfcv() function in the randomForest package. The documentation specifies input of a data frame and a response vector, so I'll start by creating those with your data.
library(randomForest)  # provides rfcv()

set.seed(42)
x <- cadets
x$RT..seconds. <- NULL             # predictors: drop the response column
y <- cadets$RT..seconds.           # response vector
rf.cv <- rfcv(x, y, cv.fold = 10)  # 10-fold cross-validation
with(rf.cv, plot(n.var, error.cv))
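If you want the numbers behind that plot, the list returned by rfcv() holds them directly:

rf.cv$n.var     # the numbers of variables tried at each step
rf.cv$error.cv  # cross-validated error for each of those sizes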
From the source:
The out-of-bag (oob) error estimate
In random forests, there is no need for cross-validation or a separate test set to get an unbiased estimate of the test set error. It is estimated internally, during the run...
In particular, predict.randomForest returns the out-of-bag prediction if newdata is not given.
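To see that behaviour in action (reusing the hypothetical rf fit sketched above and the y vector defined earlier):

oob.pred <- predict(rf)               # no newdata: out-of-bag predictions
mean((oob.pred - y)^2, na.rm = TRUE)  # OOB MSE computed by hand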