I am creating a pipeline in scikit-learn:

from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB

pipeline = Pipeline([
    ('bow', CountVectorizer()),
    ('classifier', BernoulliNB()),
])
How can I get a confusion matrix for this pipeline when evaluating it with cross_val_score?
The short answer is: "you cannot".

You need to understand the difference between cross_val_score and cross validation as a model selection method. cross_val_score, as the name suggests, works only on scores. A confusion matrix is not a score; it is a kind of summary of what happened during evaluation. The major distinction is that a score is supposed to be an orderable object, in particular, in scikit-learn, a float. So, based on a score you can tell whether method b is better than method a simply by checking whether b has a higher score. You cannot do this with a confusion matrix, which, again as the name suggests, is a matrix.
If you want to obtain confusion matrices for multiple evaluation runs (such as cross validation) you have to do this by hand, which is not that bad in scikit-learn - it is actually a few lines of code.
from sklearn.model_selection import KFold
from sklearn.metrics import confusion_matrix

kf = KFold(n_splits=5)
for train_index, test_index in kf.split(X):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    # Fit on the training fold, then print the confusion matrix for the test fold.
    model.fit(X_train, y_train)
    print(confusion_matrix(y_test, model.predict(X_test)))
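If you would rather report one matrix for the whole cross validation instead of one per fold, a small extension of the loop above (still assuming X, y and model are already defined) is to sum the per-fold matrices; passing labels= keeps the row/column order identical across folds:

import numpy as np
from sklearn.model_selection import KFold
from sklearn.metrics import confusion_matrix

labels = np.unique(y)
total = np.zeros((len(labels), len(labels)), dtype=int)
for train_index, test_index in KFold(n_splits=5).split(X):
    model.fit(X[train_index], y[train_index])
    # Summing is valid because every fold's matrix uses the same label order.
    total += confusion_matrix(y[test_index], model.predict(X[test_index]), labels=labels)
print(total)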