I am trying to train and test several scikit-learn models and print their accuracy. Only some of the models work; others fail with the
ValueError: Classi
I used a few models for stacking with the vecstack package, set needs_proba=True, and then got this error. I solved it by changing the metric passed to the stacking function. By default, stacking works with class predictions, so if you want probabilities you have to change the metric as well. I defined a new function to use as the metric:
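import numpy as np
from sklearn.metrics import precision_recall_curve

def get_classification_metric(testy, probs):
    # probs[:, 1] holds the predicted probability of the positive class
    precision, recall, thresholds = precision_recall_curve(testy, probs[:, 1])
    # convert to F-score; points where precision + recall == 0 score 0 instead of NaN
    denom = precision + recall
    fscore = np.divide(2 * precision * recall, denom, out=np.zeros_like(denom), where=denom > 0)
    # locate the index of the largest F-score
    ix = np.argmax(fscore)
    return fscore[ix]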
This function returns the highest F1 score over all thresholds, i.e. the F1 score at the optimal threshold. After that you only need to set metric=get_classification_metric inside the stacking function.
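To see why the metric has to change, here is a minimal sketch that uses only scikit-learn and NumPy, without vecstack. The synthetic data and the LogisticRegression model are my own assumptions for illustration, and the metric is a variant of the one above with a guard against 0/0. It shows that a label-based metric such as accuracy_score raises a ValueError when it is handed the (n_samples, 2) output of predict_proba, while the probability-aware metric accepts it.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_curve
from sklearn.model_selection import train_test_split


def get_classification_metric(testy, probs):
    # Best F1 over all thresholds, computed from the positive-class
    # column of an (n_samples, 2) probability array.
    precision, recall, _ = precision_recall_curve(testy, probs[:, 1])
    denom = precision + recall
    # Where precision + recall == 0, score 0 instead of producing NaN.
    fscore = np.divide(2 * precision * recall, denom,
                       out=np.zeros_like(denom), where=denom > 0)
    return fscore[np.argmax(fscore)]


# Synthetic binary data (an assumption, only for illustration).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
probs = model.predict_proba(X_test)  # shape (n_samples, 2)

# A label-based metric rejects the probability matrix with a ValueError.
try:
    accuracy_score(y_test, probs)
    label_metric_failed = False
except ValueError:
    label_metric_failed = True

# The probability-aware metric works on the same array.
best_f1 = get_classification_metric(y_test, probs)
print(label_metric_failed, round(float(best_f1), 3))
```

With vecstack the same function would be passed as the metric argument, along the lines of stacking(models, X_train, y_train, X_test, regression=False, needs_proba=True, metric=get_classification_metric). Treat that call as a sketch and check the parameter names against the version you have installed.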