I have been trying to apply recursive feature selection using the caret package. What I need is for rfe to use the AUC as the performance measure. After googling for a month I cannot get
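One problem is a minor typo: the argument should be 'trControl =', not 'trainControl ='. Also, you changed caretFuncs after you had already attached it to rfe's control function, so the change was never picked up. Lastly, you need to tell trainControl to compute class probabilities and the ROC-based summary (classProbs = TRUE and summaryFunction = twoClassSummary).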
This code works:
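    library(caret)
    data(mdrr)        ## provides mdrrDescr and mdrrClass
    subsets <- 1:10   ## assumed: matches the subset sizes in the output below

    ## Set the summary function BEFORE handing caretFuncs to rfeControl
    caretFuncs$summary <- twoClassSummary

    ctrl <- rfeControl(functions = caretFuncs,
                       method = "cv",
                       ## 'repeats' is only used with method = "repeatedcv"
                       repeats = 5, number = 10,
                       returnResamp = "final", verbose = TRUE)

    ## The inner train() calls need class probabilities to compute ROC
    trainctrl <- trainControl(classProbs = TRUE,
                              summaryFunction = twoClassSummary)

    rf.profileROC.Radial <- rfe(mdrrDescr, mdrrClass,
                                sizes = subsets,
                                rfeControl = ctrl,
                                method = "svmRadial",
                                ## I also added this line to
                                ## avoid a warning:
                                metric = "ROC",
                                trControl = trainctrl)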
    > rf.profileROC.Radial

    Recursive feature selection

    Outer resampling method: Cross-Validated (10 fold)

    Resampling performance over subset size:

     Variables    ROC   Sens   Spec   ROCSD  SensSD  SpecSD Selected
             1 0.7805 0.8356 0.6304 0.08139 0.10347 0.10093
             2 0.8340 0.8491 0.6609 0.06955 0.10564 0.09787
             3 0.8412 0.8491 0.6565 0.07222 0.10564 0.09039
             4 0.8465 0.8491 0.6609 0.06581 0.09584 0.10207
             5 0.8502 0.8624 0.6652 0.05844 0.08536 0.09404
             6 0.8684 0.8923 0.7043 0.06222 0.06893 0.09999
             7 0.8642 0.8691 0.6913 0.05655 0.10837 0.06626
             8 0.8697 0.8823 0.7043 0.05411 0.08276 0.07333
             9 0.8792 0.8753 0.7348 0.05414 0.08933 0.07232        *
            10 0.8622 0.8826 0.6696 0.07457 0.08810 0.16550
           342 0.8650 0.8926 0.6870 0.07392 0.08140 0.17367

    The top 5 variables (out of 9):
       nC, X3v, Sp, X2v, X1v
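
The ordering issue with caretFuncs comes down to how R copies objects: a list passed to a function is captured by value, so editing the original afterwards does not reach the copy. Here is a minimal base R sketch of that behaviour. The names funcs and make_control are stand-ins made up for illustration (they play the roles of caretFuncs and rfeControl); they are not caret objects.

```r
## Stand-ins for illustration only: 'funcs' plays the role of caretFuncs,
## 'make_control' the role of rfeControl(). Neither comes from caret.
funcs <- list(summary = function(...) "default summary")
make_control <- function(functions) list(functions = functions)

## Wrong order: the control object stores 'funcs' as it is right now ...
ctrl_late <- make_control(funcs)
funcs$summary <- function(...) "two-class summary"
## ... so the later change is not visible through the control object.
late_result <- ctrl_late$functions$summary()

## Right order: modify the list first, then build the control object.
ctrl_ok <- make_control(funcs)
ok_result <- ctrl_ok$functions$summary()

print(late_result)  # "default summary"
print(ok_result)    # "two-class summary"
```

The same reasoning is why caretFuncs$summary has to be set before rfeControl(functions = caretFuncs, ...) is called.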
As for the prediction problem, you should call predict on rf.profileROC.Radial itself rather than on its fit component:
    > predict(rf.profileROC.Radial, head(mdrrDescr))
          pred    Active  Inactive
    1 Inactive 0.4392768 0.5607232
    2   Active 0.6553482 0.3446518
    3   Active 0.6387261 0.3612739
    4 Inactive 0.3060582 0.6939418
    5   Active 0.6661557 0.3338443
    6   Active 0.7513180 0.2486820
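
Class probabilities like these are what the ROC metric is computed from, which is why classProbs = TRUE is needed. As a rough illustration of what the number measures, here is a base R sketch of the AUC using the rank (Mann-Whitney) formulation. The auc_rank helper and the toy vectors are made up for this example and are not part of caret; twoClassSummary does its own ROC computation internally.

```r
## Hypothetical helper, not a caret function: AUC as the probability that a
## randomly chosen positive case scores higher than a randomly chosen negative.
auc_rank <- function(scores, is_positive) {
  stopifnot(length(scores) == length(is_positive), is.logical(is_positive))
  n_pos <- sum(is_positive)
  n_neg <- sum(!is_positive)
  stopifnot(n_pos > 0, n_neg > 0)
  r <- rank(scores)  # ties receive average ranks
  (sum(r[is_positive]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

## Toy data, invented for illustration (not taken from mdrr):
toy_prob_active <- c(0.9, 0.8, 0.35, 0.6, 0.2, 0.1)
toy_is_active   <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)

toy_auc <- auc_rank(toy_prob_active, toy_is_active)
print(toy_auc)  # 8 of the 9 positive/negative pairs are ordered correctly
```

An AUC of 1 would mean every Active case got a higher probability than every Inactive one; 0.5 is what random scores would give on average.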
Max