Question
I am training a RandomForestClassifier (scikit-learn) to predict credit card fraud. When I test the model and check the ROC AUC score, I get different values depending on whether I use roc_auc_score or plot_roc_curve: roc_auc_score gives me around 0.89, while plot_roc_curve reports an AUC of 0.96. Why is that?
The labels are all 0 or 1, and the predictions are also 0 or 1. Code:
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score, plot_roc_curve
import matplotlib.pyplot as plt

clf = RandomForestClassifier(random_state=42)
clf.fit(X_train, y_train[target].values)
pred_test = clf.predict(X_test)
print(roc_auc_score(y_test, pred_test))
clf_disp = plot_roc_curve(clf, X_test, y_test)
plt.show()
Output of the code (the roc_auc_score is printed just above the graph).
Answer 1:
The ROC curve and roc_auc_score both take prediction probabilities as input, but your code is passing the predicted class labels to roc_auc_score. You need to fix that.
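A minimal sketch of the effect (on a synthetic imbalanced dataset via make_classification, since the asker's data isn't available), comparing the AUC computed from hard labels against the AUC computed from probabilities:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data standing in for the fraud dataset
X, y = make_classification(n_samples=2000, weights=[0.9], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = RandomForestClassifier(random_state=42).fit(X_train, y_train)

# Hard 0/1 predictions collapse the ranking to two values, so the ROC
# "curve" is a single point and the AUC is understated
auc_labels = roc_auc_score(y_test, clf.predict(X_test))

# Positive-class probabilities preserve the full ranking of the samples
auc_probs = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])

print(auc_labels)  # typically lower, as with the 0.89 in the question
print(auc_probs)   # higher, matching what plot_roc_curve reports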
Answer 2:
You are feeding the predicted classes instead of the predicted probabilities to roc_auc_score.
From the documentation:
y_score: array-like of shape (n_samples,) or (n_samples, n_classes)
Target scores. In the binary and multilabel cases, these can be either probability estimates or non-thresholded decision values (as returned by decision_function on some classifiers).
Change your code to:

clf = RandomForestClassifier(random_state=42)
clf.fit(X_train, y_train[target].values)
# predict_proba returns one column per class; take the positive-class column
y_score = clf.predict_proba(X_test)[:, 1]
print(roc_auc_score(y_test, y_score))
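For completeness: plot_roc_curve computes the curve from the classifier's predict_proba (or decision_function) output internally, which explains the higher value (0.96) in the plot. Note also that plot_roc_curve was deprecated in scikit-learn 1.0 and removed in 1.2; on recent versions the rough equivalent is:

import matplotlib.pyplot as plt
from sklearn.metrics import RocCurveDisplay

# from_estimator likewise obtains the scores from the fitted classifier itself
RocCurveDisplay.from_estimator(clf, X_test, y_test)
plt.show()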
Source: https://stackoverflow.com/questions/60615281/different-result-roc-auc-score-and-plot-roc-curve