So I\'m using sci-kit learn to classify some data. I have 13 different class values/categorizes to classify the data to. Now I have been able to use cross validation and print t
It is important to ensure that the way you label your confusion matrix rows and columns corresponds exactly to the way sklearn has coded the classes. The true order of the labels can be revealed using the .classes_ attribute of the classifier. You can use the code below to prepare a confusion matrix data frame.
labels = rfc.classes_
conf_df = pd.DataFrame(confusion_matrix(class_label, class_label_predicted, columns=labels, index=labels))
conf_df.index.name = 'True labels'
The second thing to note is that your classifier is not predicting labels well. The number of correctly predicted labels is shown on the main diagonal of the confusion matrix. You have non-zero values accross the matrix and some classes have not been predicted at all - the columns that are all zero. It might be a good idea to run the classifier with its default parameters and then try to optimise them.