Obtaining threshold values from a ROC curve

庸人自扰 2020-12-02 08:17

I have some models and, using the ROCR package on a vector of the predicted class percentages, I have a performance object. Plotting the performance object with the spe

3 answers
  • 2020-12-02 08:52

    Two solutions, based on the ROCR and pROC packages:

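    threshold1 <- function(predict, response) {
        # performance(pred, measure, x.measure): "sens" ends up in y.values, "spec" in x.values
        perf <- ROCR::performance(ROCR::prediction(predict, response), "sens", "spec")
        df <- data.frame(cut = perf@alpha.values[[1]], sens = perf@y.values[[1]], spec = perf@x.values[[1]])
        # cutoff that maximizes sensitivity + specificity
        df[which.max(df$sens + df$spec), "cut"]
    }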
    threshold2 <- function(predict, response) {
        r <- pROC::roc(response, predict)
        r$thresholds[which.max(r$sensitivities + r$specificities)]
    }
    data(ROCR.simple, package = "ROCR")
    threshold1(ROCR.simple$predictions, ROCR.simple$labels)
    #> [1] 0.5014893
    threshold2(ROCR.simple$predictions, ROCR.simple$labels)
    #> [1] 0.5006387
    

    See also the OptimalCutpoints package, which provides many algorithms for finding an optimal threshold.
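
    Both functions above pick the cutoff that maximizes sensitivity + specificity (equivalently, Youden's J statistic). The two results differ slightly because pROC reports thresholds as midpoints between adjacent observed scores, while ROCR reports the observed scores themselves. A minimal base-R sketch of the same criterion follows; the helper name youden_cutoff and the toy data are hypothetical and not part of either package, and it assumes higher scores indicate the positive class and that a case is called positive when its score is >= the cutoff:

    ```r
    # Toy data (made up for illustration): scores and 0/1 labels
    scores <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2)
    labels <- c(1,   1,   0,   1,   1,    0,   0,   0)

    # Hypothetical helper: try every observed score as a cutoff and
    # keep the one with the largest sensitivity + specificity
    youden_cutoff <- function(scores, labels) {
      cuts <- sort(unique(scores), decreasing = TRUE)
      sens <- sapply(cuts, function(ct) mean(scores[labels == 1] >= ct))
      spec <- sapply(cuts, function(ct) mean(scores[labels == 0] <  ct))
      cuts[which.max(sens + spec)]
    }

    best_cut <- youden_cutoff(scores, labels)

    # Midpoint of the ROCR result and the next lower ROCR.simple cutoff
    # (0.4997881); it matches the pROC result reported above
    midpoint <- mean(c(0.5014893, 0.4997881))
    ```

    Scanning only observed scores mirrors what threshold1 does; using midpoints instead would give the same classification on the training data but a different reported number.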

  • 2020-12-02 08:58

    This is why str is my favorite R function:

    library(ROCR)
    data(ROCR.simple)
    pred <- prediction( ROCR.simple$predictions, ROCR.simple$labels)
    perf <- performance(pred,"tpr","fpr")
    plot(perf)
    > str(perf)
    Formal class 'performance' [package "ROCR"] with 6 slots
      ..@ x.name      : chr "False positive rate"
      ..@ y.name      : chr "True positive rate"
      ..@ alpha.name  : chr "Cutoff"
      ..@ x.values    :List of 1
      .. ..$ : num [1:201] 0 0 0 0 0.00935 ...
      ..@ y.values    :List of 1
      .. ..$ : num [1:201] 0 0.0108 0.0215 0.0323 0.0323 ...
      ..@ alpha.values:List of 1
      .. ..$ : num [1:201] Inf 0.991 0.985 0.985 0.983 ...
    

    Aha! It's an S4 class, so we can use @ to access the slots. Here's how you make a data.frame:

    cutoffs <- data.frame(cut=perf@alpha.values[[1]], fpr=perf@x.values[[1]], 
                          tpr=perf@y.values[[1]])
    > head(cutoffs)
            cut         fpr        tpr
    1       Inf 0.000000000 0.00000000
    2 0.9910964 0.000000000 0.01075269
    3 0.9846673 0.000000000 0.02150538
    4 0.9845992 0.000000000 0.03225806
    5 0.9834944 0.009345794 0.03225806
    6 0.9706413 0.009345794 0.04301075
    

    If you have an fpr limit you need to stay under, you can sort this data.frame by tpr and subset it to find the maximum tpr below that limit:

    cutoffs <- cutoffs[order(cutoffs$tpr, decreasing=TRUE),]
    > head(subset(cutoffs, fpr < 0.2))
              cut       fpr       tpr
    96  0.5014893 0.1495327 0.8494624
    97  0.4997881 0.1588785 0.8494624
    98  0.4965132 0.1682243 0.8494624
    99  0.4925969 0.1775701 0.8494624
    100 0.4917356 0.1869159 0.8494624
    101 0.4901199 0.1962617 0.8494624
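
    The sort-and-subset step can be wrapped in a small function that returns a single row. This is only a sketch: best_under_fpr is a hypothetical helper name, and the data.frame below is a hand-copied handful of rows from the output shown above rather than the full 201-row table. Ties in tpr are broken in favour of the lower fpr:

    ```r
    # A few rows copied from the cutoffs output above (not the full table)
    cutoffs <- data.frame(
      cut = c(Inf, 0.9910964, 0.5014893, 0.4997881, 0.4901199),
      fpr = c(0, 0, 0.1495327, 0.1588785, 0.1962617),
      tpr = c(0, 0.01075269, 0.8494624, 0.8494624, 0.8494624)
    )

    # Hypothetical helper: highest tpr among rows with fpr below max_fpr,
    # preferring the lowest fpr when several rows share that tpr
    best_under_fpr <- function(cutoffs, max_fpr) {
      ok <- cutoffs[cutoffs$fpr < max_fpr, ]
      ok <- ok[order(-ok$tpr, ok$fpr), ]
      ok[1, ]
    }

    pick_20 <- best_under_fpr(cutoffs, 0.2)
    pick_10 <- best_under_fpr(cutoffs, 0.1)
    ```

    If no row satisfies the limit, ok[1, ] is a row of NA values, so check the result before using it.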
    
  • 2020-12-02 09:01

    The pROC package includes the coords function, which can compute the best threshold:

    library(pROC)
    my_roc <- roc(my_response, my_predictor)
    coords(my_roc, "best", ret = "threshold")
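
    What counts as "best" depends on the criterion. coords takes a best.method argument that accepts "youden" (the default) or "closest.topleft", and the two can disagree. The base-R sketch below illustrates both criteria on three made-up ROC points; the numbers are hypothetical and chosen so that the criteria pick different thresholds, and the code does not call pROC itself:

    ```r
    # Hypothetical ROC points (threshold, sensitivity, specificity)
    roc_pts <- data.frame(
      threshold = c(0.8, 0.6, 0.4),
      sens      = c(0.60, 0.80, 0.95),
      spec      = c(0.98, 0.77, 0.40)
    )

    # Youden's J: maximize sensitivity + specificity - 1
    youden <- roc_pts$sens + roc_pts$spec - 1

    # Closest to top-left: minimize squared distance to the corner
    # where sensitivity = 1 and specificity = 1
    dist2 <- (1 - roc_pts$sens)^2 + (1 - roc_pts$spec)^2

    best_youden  <- roc_pts$threshold[which.max(youden)]
    best_topleft <- roc_pts$threshold[which.min(dist2)]
    ```

    Here Youden's J favours the threshold with very high specificity, while the top-left distance favours the more balanced point, so it is worth stating which criterion was used when reporting a cutoff.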
    