Question
I was wondering if you could help me out with an error I am receiving when running grid search. I think it might be due to a misunderstanding of how grid search actually works.
I am running an application where I need grid search to find the best parameters using a custom scoring function. I am using RandomForestClassifier to fit a large X dataset to a characterization vector Y, which is a list of 0s and 1s (completely binary). My scoring function (MCC) requires both the prediction input and the actual input to be completely binary. However, for some reason I keep getting ValueError: multiclass is not supported.
My understanding is that grid search performs cross-validation on the dataset, produces a prediction vector for each cross-validation fold, and then passes the characterization vector and that prediction into the scoring function. Since my characterization vector is completely binary, my prediction vector should also be binary and cause no problem when evaluating the score. When I run random forest with a single defined set of parameters (without grid search), passing the predicted data and the characterization vector into the MCC scoring function works perfectly fine. So I am a little lost as to how running the grid search could cause any errors.
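To make sure I have the mechanics right, here is a simplified sketch of what I assume GridSearchCV does under the hood (this is just my mental model, not the actual sklearn implementation; it assumes X and y are numpy arrays):
import numpy as np
from sklearn.base import clone
from sklearn.grid_search import ParameterGrid
from sklearn.cross_validation import KFold

def manual_grid_search(estimator, param_grid, X, y, score_func, n_folds=5):
    # For each parameter combination, average the score over the CV folds
    # and keep the combination with the best mean score.
    best_params, best_score = None, None
    for params in ParameterGrid(param_grid):
        fold_scores = []
        for train_idx, test_idx in KFold(len(y), n_folds=n_folds):
            model = clone(estimator).set_params(**params)
            model.fit(X[train_idx], y[train_idx])
            prediction = model.predict(X[test_idx])  # binary if y is binary
            fold_scores.append(score_func(y[test_idx], prediction))
        mean_score = np.mean(fold_scores)
        if best_score is None or mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score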
Snapshot of Data:
print len(X)
print X[0]
print len(Y)
print Y[2990:3000]
17463699
[38.110903683955435, 38.110903683955435, 38.110903683955435, 9.899495124816895, 294.7808837890625, 292.3835754394531, 293.81494140625, 291.11065673828125, 293.51739501953125, 283.6424865722656, 13.580912590026855, 4.976086616516113, 1.1271398067474365, 0.9465181231498718, 0.5066819190979004, 0.1808401197195053, 0.0]
17463699
[0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
Code:
from sklearn import grid_search
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (precision_recall_fscore_support, accuracy_score,
                             matthews_corrcoef, make_scorer)

def overall_average_score(actual, prediction):
    precision = precision_recall_fscore_support(actual, prediction, average='binary')[0]
    recall = precision_recall_fscore_support(actual, prediction, average='binary')[1]
    f1_score = precision_recall_fscore_support(actual, prediction, average='binary')[2]
    total_score = (matthews_corrcoef(actual, prediction) + accuracy_score(actual, prediction)
                   + precision + recall + f1_score)
    return total_score / 5

grid_scorer = make_scorer(overall_average_score, greater_is_better=True)

parameters = {'n_estimators': [10, 20, 30],
              'max_features': ['auto', 'sqrt', 'log2', 0.5, 0.3]}

random = RandomForestClassifier()
clf = grid_search.GridSearchCV(random, parameters, cv=5, scoring=grid_scorer)
clf.fit(X, Y)
Error:
ValueError Traceback (most recent call last)
<ipython-input-39-a8686eb798b2> in <module>()
18 random = RandomForestClassifier()
19 clf = grid_search.GridSearchCV(random, parameters, cv = 5, scoring = grid_scorer)
---> 20 clf.fit(X,Y)
21
22
/shared/studies/nonregulated/neurostream/neurostream/local/lib/python2.7/site-packages/sklearn/grid_search.pyc in fit(self, X, y)
730
731 """
--> 732 return self._fit(X, y, ParameterGrid(self.param_grid))
733
734
/shared/studies/nonregulated/neurostream/neurostream/local/lib/python2.7/site-packages/sklearn/grid_search.pyc in _fit(self, X, y, parameter_iterable)
503 self.fit_params, return_parameters=True,
504 error_score=self.error_score)
--> 505 for parameters in parameter_iterable
506 for train, test in cv)
507
/shared/studies/nonregulated/neurostream/neurostream/local/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in __call__(self, iterable)
657 self._iterating = True
658 for function, args, kwargs in iterable:
--> 659 self.dispatch(function, args, kwargs)
660
661 if pre_dispatch == "all" or n_jobs == 1:
/shared/studies/nonregulated/neurostream/neurostream/local/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in dispatch(self, func, args, kwargs)
404 """
405 if self._pool is None:
--> 406 job = ImmediateApply(func, args, kwargs)
407 index = len(self._jobs)
408 if not _verbosity_filter(index, self.verbose):
/shared/studies/nonregulated/neurostream/neurostream/local/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in __init__(self, func, args, kwargs)
138 # Don't delay the application, to avoid keeping the input
139 # arguments in memory
--> 140 self.results = func(*args, **kwargs)
141
142 def get(self):
/shared/studies/nonregulated/neurostream/neurostream/local/lib/python2.7/site-packages/sklearn/cross_validation.pyc in _fit_and_score(estimator, X, y, scorer, train, test, verbose, parameters, fit_params, return_train_score, return_parameters, error_score)
1476
1477 else:
-> 1478 test_score = _score(estimator, X_test, y_test, scorer)
1479 if return_train_score:
1480 train_score = _score(estimator, X_train, y_train, scorer)
/shared/studies/nonregulated/neurostream/neurostream/local/lib/python2.7/site-packages/sklearn/cross_validation.pyc in _score(estimator, X_test, y_test, scorer)
1532 score = scorer(estimator, X_test)
1533 else:
-> 1534 score = scorer(estimator, X_test, y_test)
1535 if not isinstance(score, numbers.Number):
1536 raise ValueError("scoring must return a number, got %s (%s) instead."
/shared/studies/nonregulated/neurostream/neurostream/local/lib/python2.7/site-packages/sklearn/metrics/scorer.pyc in __call__(self, estimator, X, y_true, sample_weight)
87 else:
88 return self._sign * self._score_func(y_true, y_pred,
---> 89 **self._kwargs)
90
91
<ipython-input-39-a8686eb798b2> in overall_average_score(actual, prediction)
3 recall = precision_recall_fscore_support(actual, prediction, average = 'binary')[1]
4 f1_score = precision_recall_fscore_support(actual, prediction, average = 'binary')[2]
----> 5 total_score = matthews_corrcoef(actual, prediction)+accuracy_score(actual, prediction)+precision+recall+f1_score
6 return total_score/5
7 def show_score(actual,prediction):
/shared/studies/nonregulated/neurostream/neurostream/local/lib/python2.7/site-packages/sklearn/metrics/classification.pyc in matthews_corrcoef(y_true, y_pred)
395
396 if y_type != "binary":
--> 397 raise ValueError("%s is not supported" % y_type)
398
399 lb = LabelEncoder()
ValueError: multiclass is not supported
Answer 1:
I reproduced your experiment but I do not get any error.
The error indicates that one of your vectors (actual or prediction) contains more than two discrete values. It is indeed weird that you are able to score a random forest trained outside of GridSearchCV. Could you provide the exact code you run to do this?
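In the meantime, one way to pinpoint the problem would be to wrap your scoring function so it prints what sklearn's own target-type detection (type_of_target, which is the check matthews_corrcoef performs internally) sees on each fold. A small diagnostic sketch (debug_score is just an illustrative wrapper name):
from sklearn.utils.multiclass import type_of_target

def debug_score(actual, prediction):
    # matthews_corrcoef raises "multiclass is not supported" whenever
    # type_of_target reports anything other than 'binary' for its inputs
    print("actual: %s" % type_of_target(actual))
    print("prediction: %s" % type_of_target(prediction))
    return overall_average_score(actual, prediction)

grid_scorer = make_scorer(debug_score, greater_is_better=True)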
Here's the code I used to try to reproduce the error:
from __future__ import print_function

from sklearn.datasets import make_classification
from sklearn.grid_search import GridSearchCV
from sklearn.metrics import precision_recall_fscore_support, accuracy_score, \
    matthews_corrcoef, make_scorer
from sklearn.ensemble import RandomForestClassifier
from sklearn.cross_validation import train_test_split

def overall_average_score(actual, prediction):
    precision, recall, f1_score, _ = precision_recall_fscore_support(
        actual, prediction, average='binary')
    total_score = (matthews_corrcoef(actual, prediction) +
                   accuracy_score(actual, prediction) + precision + recall + f1_score)
    return total_score / 5

grid_scorer = make_scorer(overall_average_score, greater_is_better=True)

print("Without GridSearchCV")
X, y = make_classification(n_samples=500, n_informative=10, n_classes=2)
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    test_size=0.5, random_state=0)

rf = RandomForestClassifier()
rf.fit(X_train, y_train)
y_pred = rf.predict(X_test)
print("Overall average score: ", overall_average_score(y_test, y_pred))

print("-" * 30)
print("With GridSearchCV:")

parameters = {'n_estimators': [10, 20, 30],
              'max_features': ['auto', 'sqrt', 'log2', 0.5, 0.3]}
gs_rf = GridSearchCV(rf, parameters, cv=5, scoring=grid_scorer)
gs_rf.fit(X_train, y_train)
print("Best score with grid search: ", gs_rf.best_score_)
Now I'd like to make a few comments on the code you provided:
- It's not great practice to use variable names such as random (this is usually a module) or f1_score (this conflicts with the sklearn.metrics.f1_score function).
- You could unpack precision, recall and f1_score directly instead of calling precision_recall_fscore_support three times.
- It does not really make sense to grid search on n_estimators: more trees is always better. If you are worried about overfitting, you can reduce the complexity of the individual models using other parameters such as max_depth or min_samples_split (see the sketch after this list).
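For example, a grid along those lines could look like this (the specific values are only illustrative, not tuned for your data):
parameters = {'max_depth': [5, 10, None],
              'min_samples_split': [2, 10, 50],
              'max_features': ['sqrt', 'log2', 0.5, 0.3]}
gs_rf = GridSearchCV(RandomForestClassifier(n_estimators=100), parameters,
                     cv=5, scoring=grid_scorer)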
Answer 2:
The Matthews Correlation Coefficient is a score between -1 and 1, so it is not correct to average it directly with f1_score, precision, recall and accuracy_score.
MCC values indicate: 1 is total positive correlation, 0 is no correlation, and -1 is total negative correlation.
The other evaluation metrics mentioned above lie between 0 and 1 (from worst to best), so the ranges and the meanings are not the same.
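If you still want a single averaged score, one option (a sketch, not the only possible normalization) is to first map MCC onto [0, 1] via (MCC + 1) / 2 before averaging:
from sklearn.metrics import precision_recall_fscore_support, accuracy_score, \
    matthews_corrcoef

def overall_average_score(actual, prediction):
    precision, recall, f1, _ = precision_recall_fscore_support(
        actual, prediction, average='binary')
    # rescale MCC from [-1, 1] onto [0, 1] so all five terms share one range
    mcc_scaled = (matthews_corrcoef(actual, prediction) + 1) / 2.0
    return (mcc_scaled + accuracy_score(actual, prediction)
            + precision + recall + f1) / 5.0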
Source: https://stackoverflow.com/questions/31615190/sklearn-gridsearchcv-scoring-function-error