Confusion matrix on images in a Keras CNN


Solution 1

Here's how to get the confusion matrix (and other statistics via scikit-learn) for all classes:

1. Predict classes

import math
import numpy
from keras.preprocessing.image import ImageDataGenerator

test_generator = ImageDataGenerator()
test_data_generator = test_generator.flow_from_directory(
    test_data_path,  # Put your path here
    target_size=(img_width, img_height),
    batch_size=32,
    shuffle=False)
test_steps_per_epoch = math.ceil(test_data_generator.samples / test_data_generator.batch_size)

predictions = model.predict_generator(test_data_generator, steps=test_steps_per_epoch)
# Get most likely class
predicted_classes = numpy.argmax(predictions, axis=1)

2. Get ground-truth classes and class labels

true_classes = test_data_generator.classes
class_labels = list(test_data_generator.class_indices.keys())   

3. Use scikit-learn to get statistics

from sklearn import metrics

report = metrics.classification_report(true_classes, predicted_classes, target_names=class_labels)
print(report)

You can read more here
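
If you also want the confusion matrix itself (as asked in the question), the same label arrays can be passed to scikit-learn's confusion_matrix. A minimal sketch, using the true_classes and predicted_classes computed above:

conf_matrix = metrics.confusion_matrix(true_classes, predicted_classes)
print(conf_matrix)  # rows correspond to true classes, columns to predicted classes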

EDIT: If the above does not work, have a look at the video Create confusion matrix for predictions from Keras model (and check its comments if you run into issues), or Make predictions with a Keras CNN Image Classifier.

Solution 2

Why would the scikit-learn function not do the job? Forward-pass all the samples (images) in your train/test set, convert the one-hot-encoded outputs to label encoding (see link), and pass the result to sklearn.metrics.confusion_matrix as y_pred. Proceed in a similar fashion with y_true (one-hot to label).

Sample code:

import numpy as np
import sklearn.metrics as metrics

y_pred_ohe = model.predict(X)  # model is your trained Keras classifier; output shape=(n_samples, 12)
y_pred_labels = np.argmax(y_pred_ohe, axis=1)  # only needed if the output is one-hot encoded; shape=(n_samples,)

confusion_matrix = metrics.confusion_matrix(y_true=y_true_labels, y_pred=y_pred_labels)  # shape=(12, 12)
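
To get y_true_labels as described above, convert the one-hot ground truth in the same way; a minimal sketch, where y_true_ohe is a hypothetical array holding the one-hot-encoded true labels:

# y_true_ohe: hypothetical one-hot-encoded ground truth, shape=(n_samples, 12)
y_true_labels = np.argmax(y_true_ohe, axis=1)  # shape=(n_samples,)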

Solution 3

Here, cats and dogs are the class labels:

# Confusion Matrix and Classification Report
import numpy as np
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix

Y_pred = model.predict_generator(validation_generator, nb_validation_samples // batch_size + 1)
y_pred = np.argmax(Y_pred, axis=1)

print('Confusion Matrix')
print(confusion_matrix(validation_generator.classes, y_pred))

print('Classification Report')
target_names = ['Cats', 'Dogs']
print(classification_report(validation_generator.classes, y_pred, target_names=target_names))
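
For the asker's 12-class case the label names need not be hard-coded; a minimal sketch, assuming validation_generator comes from flow_from_directory as in the question and is created with shuffle=False (otherwise .classes will not line up with the predictions):

# Class names in the order of the generator's label indices
target_names = list(validation_generator.class_indices.keys())
print(classification_report(validation_generator.classes, y_pred, target_names=target_names))
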
Comments

  • Abhishek Singh over 3 years

    I have trained my CNN model (multiclass classification) using Keras, and now I want to evaluate it on my test set of images.

    What are the possible options for evaluating my model apart from accuracy, precision, and recall? I know how to get precision and recall from a custom script, but I cannot find a way to get the confusion matrix for my 12 classes of images. Scikit-learn shows a way, but not for images. I am using model.fit_generator().

    Is there a way to create a confusion matrix for all my classes, or to find the classification confidence for each class? I am using Google Colab, though I can download the model and run it locally.

    Any help would be appreciated.

    Code:

    from keras.preprocessing.image import ImageDataGenerator
    from keras.layers import Flatten, Dense, BatchNormalization, Dropout
    from keras.models import Model
    from keras import models, optimizers
    from keras import backend as K
    from keras_vggface.vggface import VGGFace

    train_data_path = 'dataset_cfps/train'
    validation_data_path = 'dataset_cfps/validation'

    # Parameters
    img_width, img_height = 224, 224
    
    vggface = VGGFace(model='resnet50', include_top=False, input_shape=(img_width, img_height, 3))
    
    #vgg_model = VGGFace(include_top=False, input_shape=(224, 224, 3))
    
    last_layer = vggface.get_layer('avg_pool').output
    x = Flatten(name='flatten')(last_layer)
    xx = Dense(256, activation = 'sigmoid')(x)
    x1 = BatchNormalization()(xx)
    x2 = Dropout(0.3)(x1)
    y = Dense(256, activation = 'sigmoid')(x2)
    yy = BatchNormalization()(y)
    y1 = Dropout(0.6)(yy)
    x3 = Dense(12, activation='sigmoid', name='classifier')(y1)
    
    custom_vgg_model = Model(vggface.input, x3)
    
    
    # Create the model
    model = models.Sequential()
    
    # Add the convolutional base model
    model.add(custom_vgg_model)
    
    model.summary()
    #model = load_model('facenet_resnet_lr3_SGD_sameas1.h5')
    
    def recall(y_true, y_pred):
         true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
         possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
         recall = true_positives / (possible_positives + K.epsilon())
         return recall
    
    def precision(y_true, y_pred):
        true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
        predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
        precision = true_positives / (predicted_positives + K.epsilon())
        return precision
    
    train_datagen = ImageDataGenerator(
          rescale=1./255,
          rotation_range=20,
          width_shift_range=0.2,
          height_shift_range=0.2,
          horizontal_flip=True,
          fill_mode='nearest')
    
    
    validation_datagen = ImageDataGenerator(rescale=1./255)
    
    # Change the batchsize according to your system RAM
    train_batchsize = 32
    val_batchsize = 32
    
    train_generator = train_datagen.flow_from_directory(
            train_data_path,
            target_size=(img_width, img_height),
            batch_size=train_batchsize,
            class_mode='categorical')
    
    validation_generator = validation_datagen.flow_from_directory(
            validation_data_path,
            target_size=(img_width, img_height),
            batch_size=val_batchsize,
            class_mode='categorical',
            shuffle=True)
    
    # Compile the model
    model.compile(loss='categorical_crossentropy',
                  optimizer=optimizers.SGD(lr=1e-3),
                  metrics=['acc', recall, precision])
    # Train the model
    history = model.fit_generator(
          train_generator,
          steps_per_epoch=train_generator.samples/train_generator.batch_size ,
          epochs=100,
          validation_data=validation_generator,
          validation_steps=validation_generator.samples/validation_generator.batch_size,
          verbose=1)
    
    # Save the model
    model.save('facenet_resnet_lr3_SGD_new_FC.h5')
    
  • Abhishek Singh almost 6 years
    Could you please show some code so that I can better understand and upvote/accept your answer and close it?
  • Abhishek Singh almost 6 years
    Hey @Jan K, I have updated my code. Thank you for helping out. What can I add here?
  • Nguai al about 2 years
    In my case, how is it that model.evaluate() returns an accuracy of 75% but the F1 score is only 25%? I would think that evaluate() and predict() should give consistent results.