Feature Selection in caret rfe + svm with ROC

忘了有多久 2021-02-06 08:19

I have been trying to apply recursive feature selection using the caret package. What I need is for rfe to use the AUC as the performance measure. After googling for a month I cannot get

1 answer
  •  遥遥无期
    2021-02-06 08:49

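    One problem is a minor typo: the argument that rfe passes through to train() is 'trControl=', not 'trainControl='. Also, you changed caretFuncs after you had attached it to rfe's control function, so the change never reached rfe. Lastly, you need to tell trainControl to compute class probabilities and to use twoClassSummary so that the ROC metric can be calculated.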
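The ordering point can be seen in a minimal base R sketch. The names `funcs` and `make_control` are hypothetical stand-ins for `caretFuncs` and `rfeControl()`, not caret code; the sketch only assumes R's usual copy-on-modify behaviour, where a list stored inside another object does not see later changes to the original.

```r
# Hypothetical stand-ins for caretFuncs and rfeControl(); not caret code.
funcs <- list(summary = function(...) "default summary")
make_control <- function(functions) list(functions = functions)

# Wrong order: the control object is built first, so it keeps the old copy.
ctrl_stale <- make_control(funcs)
funcs$summary <- function(...) "ROC summary"

# Right order: modify the list first, then build the control object.
ctrl_fresh <- make_control(funcs)

ctrl_stale$functions$summary()  # still the default summary
ctrl_fresh$functions$summary()  # the replaced summary
```

This is why the assignment to `caretFuncs$summary` has to come before the call to `rfeControl()`.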

    This code works:

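     library(caret)   # svmRadial also needs the kernlab package installed
     data(mdrr)       # provides mdrrDescr and mdrrClass
    
     ## Set the summary function BEFORE passing caretFuncs to rfeControl()
     caretFuncs$summary <- twoClassSummary
    
     ## Outer resampling: 10-fold CV (repeats is only used by "repeatedcv")
     ctrl <- rfeControl(functions = caretFuncs,
                        method = "cv",
                        repeats = 5, number = 10,
                        returnResamp = "final", verbose = TRUE)
    
     ## Inner train() calls need class probabilities to compute the ROC metric
     trainctrl <- trainControl(classProbs = TRUE,
                               summaryFunction = twoClassSummary)
    
     ## subsets: the vector of subset sizes defined in the question
     rf.profileROC.Radial <- rfe(mdrrDescr, mdrrClass,
                                 sizes = subsets,
                                 rfeControl = ctrl,
                                 method = "svmRadial",
                                 ## I also added this line to
                                 ## avoid a warning:
                                 metric = "ROC",
                                 trControl = trainctrl)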
    
    
     > rf.profileROC.Radial
    
     Recursive feature selection
    
     Outer resampling method: Cross-Validated (10 fold) 
    
     Resampling performance over subset size:
    
      Variables    ROC   Sens   Spec   ROCSD  SensSD  SpecSD Selected
              1 0.7805 0.8356 0.6304 0.08139 0.10347 0.10093         
              2 0.8340 0.8491 0.6609 0.06955 0.10564 0.09787         
              3 0.8412 0.8491 0.6565 0.07222 0.10564 0.09039         
              4 0.8465 0.8491 0.6609 0.06581 0.09584 0.10207         
              5 0.8502 0.8624 0.6652 0.05844 0.08536 0.09404         
              6 0.8684 0.8923 0.7043 0.06222 0.06893 0.09999         
              7 0.8642 0.8691 0.6913 0.05655 0.10837 0.06626         
              8 0.8697 0.8823 0.7043 0.05411 0.08276 0.07333         
              9 0.8792 0.8753 0.7348 0.05414 0.08933 0.07232        *
             10 0.8622 0.8826 0.6696 0.07457 0.08810 0.16550         
            342 0.8650 0.8926 0.6870 0.07392 0.08140 0.17367         
    
     The top 5 variables (out of 9):
        nC, X3v, Sp, X2v, X1v
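
As a reading aid for the ROC column above: it is an area under the ROC curve, which can be interpreted as the probability that a randomly chosen case from the positive class gets a higher score than a randomly chosen case from the other class. The following is a minimal base R sketch of that quantity using the rank (Mann-Whitney) formula. The function name `auc_rank` and the example data are made up for illustration, and this is not how caret computes the value internally.

```r
# Rank-based AUC: the probability that a randomly chosen positive case
# scores higher than a randomly chosen negative case (ties count as 1/2).
auc_rank <- function(score, truth, positive) {
  is_pos <- truth == positive
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  r <- rank(score)  # average ranks handle tied scores
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Made-up scores and labels, for illustration only
score <- c(0.9, 0.8, 0.4, 0.3, 0.2)
truth <- c("Active", "Active", "Inactive", "Active", "Inactive")
auc_value <- auc_rank(score, truth, positive = "Active")
print(auc_value)  # 5 of the 6 positive/negative pairs are ordered correctly
```

With real predictions, `score` would be the predicted probability of the class treated as the event (assumed here to be "Active", the first factor level of `mdrrClass`).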
    

    For prediction, call predict() on rf.profileROC.Radial itself rather than on its fit component:

     > predict(rf.profileROC.Radial, head(mdrrDescr))
           pred    Active  Inactive
     1 Inactive 0.4392768 0.5607232
     2   Active 0.6553482 0.3446518
     3   Active 0.6387261 0.3612739
     4 Inactive 0.3060582 0.6939418
     5   Active 0.6661557 0.3338443
     6   Active 0.7513180 0.2486820
    

    Max
