ValueError: Data is not binary and pos_label is not specified

闹比i 2021-01-07 18:57

I am trying to calculate roc_auc_score, but I am getting the following error:

"ValueError: Data is not binary and pos_label is not specified"


        
2 Answers
  • 2021-01-07 19:03

    The problem is in y_true=np.array(['0', '1', '0', '0', '1', '1', '1', '1', '1']): the labels are strings, not numbers. Convert the values of y_true to Boolean:

    # Element-wise comparison: True where the label is the string '1'
    y_true = y_true == '1'
    print(y_true) # [False  True False False  True  True  True  True  True]
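
If integer labels are preferred over Booleans, the string array can also be cast directly. This is a minimal sketch using the y_true from the question; the name y_true_int is introduced here only for illustration.

```python
import numpy as np

# Labels as given in the question: strings, not numbers.
y_true = np.array(['0', '1', '0', '0', '1', '1', '1', '1', '1'])

# Cast the string labels to integers so the classes become 0 and 1.
y_true_int = y_true.astype(int)

print(y_true_int)             # [0 1 0 0 1 1 1 1 1]
print(np.unique(y_true_int))  # [0 1]
```

The cast assumes every label is a string that parses as an integer; any other value would make astype(int) raise a ValueError.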
    
  • 2021-01-07 19:26

    You only need to change y_true so it looks like this:

    y_true=np.array([0, 1, 0, 0, 1, 1, 1, 1, 1])
    

    Explanation: If you look at what the roc_auc_score function does in https://github.com/scikit-learn/scikit-learn/blob/0.15.X/sklearn/metrics/metrics.py you will see that y_true is checked as follows:

    classes = np.unique(y_true)
    if (pos_label is None and not (np.all(classes == [0, 1]) or
     np.all(classes == [-1, 1]) or
     np.all(classes == [0]) or
     np.all(classes == [-1]) or
     np.all(classes == [1]))):
        raise ValueError("Data is not binary and pos_label is not specified")
    

    When the function runs, pos_label is None. Because you are defining y_true as an array of strings, every np.all(...) comparison is False, so the negated expression is True, the if condition holds, and the exception is raised.
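
The effect of that check can be reproduced without scikit-learn. The sketch below is a simplified stand-in, not the library's code: labels_pass_check and ACCEPTED are hypothetical names, and the comparison is done on Python sets rather than with np.all, under the assumption that only the set of distinct labels matters.

```python
import numpy as np

# Label sets the quoted check accepts when pos_label is None.
ACCEPTED = ({0, 1}, {-1, 1}, {0}, {-1}, {1})


def labels_pass_check(y_true):
    """Hypothetical helper: True if the distinct labels form an accepted set."""
    classes = set(np.unique(y_true).tolist())
    return classes in ACCEPTED


as_strings = np.array(['0', '1', '0', '0', '1', '1', '1', '1', '1'])
as_ints = as_strings.astype(int)

print(labels_pass_check(as_strings))  # False: {'0', '1'} is not {0, 1}
print(labels_pass_check(as_ints))     # True
```

The string labels fail because '0' and '1' never compare equal to the integers 0 and 1, which is the situation that leads to the ValueError in the question.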
