Question
I cannot get the roc function to work; I get the error "Predictor must be numeric or ordered".
I've looked through other posts, but nothing solves my problem. Any help is highly appreciated.
"Get data"
flying=dget("https://www.math.ntnu.no/emner/TMA4268/2019v/data/flying.dd")
ctrain=flying$ctrain
ctest=flying$ctest
library(MASS)
fly_qda=qda(diabetes~., data=ctrain)
#Test error is given below:
predict_qda=predict(fly_qda, newdata=ctest, probability=TRUE)
table_qda<-table(ctest$diabetes, predict_qda$class)
error_qda<-1-sum(diag(table_qda))/sum(table_qda)
error_qda
"ROC curve and AUC"
predict_qdatrain<-predict(fly_qda, newdata=ctrain)
roc_qda=roc(response=ctrain$diabetes, predictor= predict_qdatrain$class, plot=TRUE)
plot(roc_qda, col="red", lwd=3, main="ROC curve QDA")
auc_qda<-auc(roc_qda)
I want the plotted ROC curve and the AUC.
Answer 1:
As Ollie Perkins explained in his answer, the error indicates that you are passing a predictor that cannot be sorted, so it cannot be used for ROC analysis.
In the case of predict.qda, the class item is a factor with 1s and 0s indicating the class.
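The type mismatch can be checked directly. The following is a minimal sketch that does not use the question's data: it assumes the Pima.tr data set shipped with MASS as a stand-in for ctrain, so the class levels are "No" and "Yes" rather than 0 and 1.

```r
library(MASS)  # provides qda() and the Pima.tr example data

# Assumption: Pima.tr replaces the question's ctrain; `type` is the class factor
fit  <- qda(type ~ ., data = Pima.tr)
pred <- predict(fit, newdata = Pima.tr)

# The hard class labels are an unordered factor: neither numeric nor ordered
is.factor(pred$class)    # TRUE
is.ordered(pred$class)   # FALSE
is.numeric(pred$class)   # FALSE

# The posterior probabilities are a numeric matrix with one column per class
is.numeric(pred$posterior)  # TRUE
colnames(pred$posterior)    # "No" "Yes"
```

Because pred$class fails both checks, passing it as the predictor would trigger the error from the question.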
Rather than converting the class to an ordered predictor, it is a better idea to use the posterior probabilities. Let's use the probability of belonging to class 1:
roc_qda <- roc(response = ctrain$diabetes, predictor = predict_qdatrain$posterior[,"1"])
plot(roc_qda, col="red", lwd=3, main="ROC curve QDA")
auc(roc_qda)
This will give you a smoother curve and more classification thresholds to choose from.
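To see the difference in the number of thresholds, here is a hedged sketch comparing the two predictors. It assumes the pROC package is installed and again uses MASS's Pima.tr as a stand-in for the question's data, so the positive class is "Yes" instead of "1". The levels, direction and quiet arguments are set only to make the comparison explicit and silent.

```r
library(MASS)
library(pROC)  # assumed to be installed

# Assumption: Pima.tr replaces the question's ctrain
fit  <- qda(type ~ ., data = Pima.tr)
pred <- predict(fit, newdata = Pima.tr)

# ROC built from the hard labels, converted to an ordered factor
roc_class <- roc(response = Pima.tr$type,
                 predictor = factor(pred$class, ordered = TRUE),
                 levels = c("No", "Yes"), direction = "<", quiet = TRUE)

# ROC built from the posterior probability of the "Yes" class
roc_post <- roc(response = Pima.tr$type,
                predictor = pred$posterior[, "Yes"],
                levels = c("No", "Yes"), direction = "<", quiet = TRUE)

# Number of candidate cut-offs available on each curve
n_thr_class <- length(roc_class$thresholds)
n_thr_post  <- length(roc_post$thresholds)
c(class = n_thr_class, posterior = n_thr_post)

auc(roc_class)
auc(roc_post)
```

With only two class labels there is a single informative cut-off, whereas the posterior probabilities provide one for nearly every distinct predicted value.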
Answer 2:
Assuming you are using the pROC package, I have fixed this below. The error message means that the predictor must be either numeric (for example, a probability or a score) or an ordered factor (a categorical variable whose levels have a meaningful order). The class returned by predict is an unordered factor, so I convert it to an ordered factor on the fly in the call to roc below.
Secondly, in your original code you were predicting on the original training set. I have changed this to the test data below.
"Get data"
flying=dget("https://www.math.ntnu.no/emner/TMA4268/2019v/data/flying.dd")
ctrain=flying$ctrain
ctest=flying$ctest
library(MASS)
library(pROC)
fly_qda=qda(diabetes~., data=ctrain)
#Test error is given below:
predict_qda=predict(fly_qda, newdata=ctest, probability=TRUE)
table_qda<-table(ctest$diabetes, predict_qda$class)
error_qda<-1-sum(diag(table_qda))/sum(table_qda)
error_qda
"ROC curve and AUC"
# Predict on the held-out test set rather than the training set
predict_qdatest<-predict(fly_qda, newdata=ctest)
roc_qda=roc(response=ctest$diabetes, predictor= factor(predict_qdatest$class,
ordered = TRUE), plot=TRUE)
plot(roc_qda, col="red", lwd=3, main="ROC curve QDA")
auc_qda<-auc(roc_qda)
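As an illustration of evaluating on held-out data, the sketch below fits on one data set and computes the ROC curve on another. It is not the code above: it assumes pROC is installed, uses MASS's Pima.tr and Pima.te as stand-ins for ctrain and ctest, and uses posterior probabilities as the predictor, as suggested in the other answer, instead of the ordered class factor.

```r
library(MASS)
library(pROC)  # assumed to be installed

# Assumption: Pima.tr / Pima.te replace the question's ctrain / ctest
fit <- qda(type ~ ., data = Pima.tr)

# Posterior probability of "Yes" on the training and on the held-out data
p_train <- predict(fit, newdata = Pima.tr)$posterior[, "Yes"]
p_test  <- predict(fit, newdata = Pima.te)$posterior[, "Yes"]

roc_train <- roc(Pima.tr$type, p_train,
                 levels = c("No", "Yes"), direction = "<", quiet = TRUE)
roc_test  <- roc(Pima.te$type, p_test,
                 levels = c("No", "Yes"), direction = "<", quiet = TRUE)

# Plot the curve for the data the model has not seen
plot(roc_test, col = "red", lwd = 3, main = "ROC curve QDA (test data)")

auc_train <- as.numeric(auc(roc_train))
auc_test  <- as.numeric(auc(roc_test))
c(train = auc_train, test = auc_test)
```

The test AUC is the figure to report as an estimate of performance on new data; the training AUC is shown only for comparison.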
Source: https://stackoverflow.com/questions/55760669/roc-function-error-predictor-must-be-numeric-or-ordered