How to get confusion matrix when using model.fit_generator

Submitted by 孤人 on 2019-12-08 15:26:54

Question


I am using model.fit_generator to train a binary (two-class) model, because I feed the input images directly from a folder. How can I get the confusion matrix (TP, TN, FP, FN) in this case? I normally use confusion_matrix from sklearn.metrics, which requires the predicted and the actual labels, but here I have neither. Maybe I can compute the predicted labels with predict = model.predict_generator(validation_generator), but I don't know how my model derives the true labels from my images. The general structure of my input folder is:

train/
 class1/
     img1.jpg
     img2.jpg
     ........
 class2/
     IMG1.jpg
     IMG2.jpg
test/
 class1/
     img1.jpg
     img2.jpg
     ........
 class2/
     IMG1.jpg
     IMG2.jpg
     ........

and the relevant parts of my code are:

train_generator = train_datagen.flow_from_directory(
        'train',
        target_size=(50, 50), batch_size=batch_size,
        class_mode='binary', color_mode='grayscale')

validation_generator = test_datagen.flow_from_directory(
        'test',
        target_size=(50, 50), batch_size=batch_size,
        class_mode='binary', color_mode='grayscale')

model.fit_generator(
        train_generator, steps_per_epoch=250, epochs=40,
        validation_data=validation_generator,
        validation_steps=21)

So the code above automatically picks up the two classes, but I don't know which folder it treats as class 0 and which as class 1.


Answer 1:


I've managed it in the following way, using keras.utils.Sequence.

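import numpy as np
from sklearn.metrics import confusion_matrix
from keras.utils import Sequence


class MySequence(Sequence):
    def __init__(self, evaluation_set, batch_size):
        # Placeholder initialization; see the Keras manual on implementing Sequence.
        # Assumption for illustration: evaluation_set is an
        # (inputs, one_hot_labels) pair of arrays.
        self.x, self.y = evaluation_set
        self.batch_size = batch_size

    def __len__(self):
        # number of batches in one pass over the data
        return int(np.ceil(len(self.x) / self.batch_size))

    def __getitem__(self, index):
        # return the index-th batch; the same index must always give the same batch
        batch = slice(index * self.batch_size, (index + 1) * self.batch_size)
        return self.x[batch], self.y[batch]


# create data generator
data_gen = MySequence(evaluation_set, batch_size=10)

n_batches = len(data_gen)

# m is the trained model; labels and outputs are one-hot here, hence argmax
y_true = np.concatenate([np.argmax(data_gen[i][1], axis=1) for i in range(n_batches)])
y_pred = np.argmax(m.predict_generator(data_gen, steps=n_batches), axis=1)

confusion_matrix(y_true, y_pred)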

The implemented class returns batches of data as tuples, which avoids holding the whole dataset in RAM. Note that the batching must be implemented in __getitem__, and this method must return the same batch for the same index.

Unfortunately, this code iterates over the data twice: first to build the array of true labels from the returned batches, and then again when the model's predict method is called.
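
One way to avoid the second pass is to predict batch by batch and tally the counts from the same batch that supplied the labels. The following is a minimal sketch, not part of the original answer. It assumes binary labels (0 or 1) and one probability per sample. The names confusion_counts_single_pass, predict_fn and toy_batches are hypothetical; with a Keras model, predict_fn would be something like a wrapper around model.predict_on_batch that flattens the output.

```python
def confusion_counts_single_pass(batches, predict_fn, threshold=0.5):
    """Accumulate TN, FP, FN, TP while visiting each batch exactly once.

    batches: iterable of (inputs, labels) pairs, labels being 0 or 1.
    predict_fn: hypothetical callable returning one probability per input.
    """
    tn = fp = fn = tp = 0
    for inputs, labels in batches:
        probs = predict_fn(inputs)
        for label, prob in zip(labels, probs):
            predicted = 1 if prob > threshold else 0
            if label == 1 and predicted == 1:
                tp += 1
            elif label == 1:
                fn += 1
            elif predicted == 1:
                fp += 1
            else:
                tn += 1
    return tn, fp, fn, tp


# Toy stand-ins: the "inputs" are already probabilities and the "model" is the identity.
toy_batches = [([0.9, 0.2, 0.7], [1, 0, 0]), ([0.4, 0.1], [1, 0])]
counts = confusion_counts_single_pass(toy_batches, lambda inputs: inputs)
print(counts)
```

Because labels and predictions come from the same batch, this version also does not depend on the generator returning batches in a reproducible order.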




Answer 2:


You can view the mapping from class names to class indices by reading the class_indices attribute of your train_generator or validation_generator objects, as in

train_generator.class_indices
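
flow_from_directory assigns indices to the class subfolders in alphanumeric order of their names, so for the layout in the question class1 becomes 0 and class2 becomes 1. The sketch below only mimics that rule with the standard library, without calling Keras. The helper class_indices_for is a hypothetical name, and the temporary folders stand in for the real train directory.

```python
import os
import tempfile


def class_indices_for(directory):
    """Mimic the sorted subfolder-to-index mapping reported by class_indices."""
    classes = sorted(
        name for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )
    return {name: index for index, name in enumerate(classes)}


# Build a throwaway folder layout like the one in the question.
with tempfile.TemporaryDirectory() as root:
    for name in ("class2", "class1"):
        os.makedirs(os.path.join(root, name))
    mapping = class_indices_for(root)

print(mapping)
```

On a real generator, train_generator.class_indices remains the authoritative source for the mapping.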




Answer 3:


probabilities = model.predict_generator(generator=test_generator)

gives us the predicted probabilities.

y_true = test_generator.classes

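gives us the true labels, in the order in which the files were listed. For these to line up with the predictions, the generator must be created with shuffle=False.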

Because this is a binary classification problem, the probabilities have to be thresholded to get the predicted labels. To do that you can use

y_pred = (probabilities > 0.5).astype(int).ravel()

Now we have the true labels and the predicted labels for the test dataset, so the confusion matrix can be computed and plotted with

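import matplotlib
from sklearn.metrics import confusion_matrix
# plot_confusion_matrix with these arguments is assumed to be the mlxtend version
from mlxtend.plotting import plot_confusion_matrix

font = {
    'family': 'Times New Roman',
    'size': 12
}
matplotlib.rc('font', **font)
mat = confusion_matrix(y_true, y_pred)
plot_confusion_matrix(conf_mat=mat, figsize=(8, 8), show_normed=False)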
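
To read TP, TN, FP and FN out of the matrix, it helps to know its layout: rows are true classes and columns are predicted classes, so a binary matrix is [[TN, FP], [FN, TP]]. The sketch below rebuilds that layout in plain Python on made-up labels and probabilities. The function binary_confusion_matrix is a hypothetical helper, not a library call. With sklearn, the same four numbers come from confusion_matrix(y_true, y_pred).ravel(), in the order tn, fp, fn, tp.

```python
def binary_confusion_matrix(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]]: rows are true classes, columns are predicted."""
    mat = [[0, 0], [0, 0]]
    for t, p in zip(y_true, y_pred):
        mat[int(t)][int(p)] += 1
    return mat


# Hypothetical true labels and predicted probabilities, for illustration only.
y_true = [0, 0, 1, 1, 1, 0]
probabilities = [0.1, 0.8, 0.9, 0.3, 0.6, 0.2]
y_pred = [int(p > 0.5) for p in probabilities]

mat = binary_confusion_matrix(y_true, y_pred)
(tn, fp), (fn, tp) = mat
print(mat, tn, fp, fn, tp)
```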


Source: https://stackoverflow.com/questions/47907061/how-to-get-confusion-matrix-when-using-model-fit-generator
