How to get precision, recall and f-measure from confusion matrix in Python

python machine-learning scikit-learn confusion-matrix precision-recall

Solution 1

Let's consider MNIST classification (10 classes), where a test set of 10,000 samples gives the following confusion matrix cm (a NumPy array). Rows correspond to the true classes and columns to the predicted classes, as in scikit-learn's confusion_matrix:

array([[ 963,    0,    0,    1,    0,    2,   11,    1,    2,    0],
       [   0, 1119,    3,    2,    1,    0,    4,    1,    4,    1],
       [  12,    3,  972,    9,    6,    0,    6,    9,   13,    2],
       [   0,    0,    8,  975,    0,    2,    2,   10,   10,    3],
       [   0,    2,    3,    0,  953,    0,   11,    2,    3,    8],
       [   8,    1,    0,   21,    2,  818,   17,    2,   15,    8],
       [   9,    3,    1,    1,    4,    2,  938,    0,    0,    0],
       [   2,    7,   19,    2,    2,    0,    0,  975,    2,   19],
       [   8,    5,    4,    8,    6,    4,   14,   11,  906,    8],
       [  11,    7,    1,   12,   16,    1,    1,    6,    5,  949]])

In order to get the precision & recall (per class), we need to compute the TP, FP, and FN per class. We don't need TN, but we will compute it, too, as it will help us for our sanity check.

The True Positives are simply the diagonal elements:

# numpy should have already been imported as np
TP = np.diag(cm)
TP
# array([ 963, 1119,  972,  975,  953,  818,  938,  975,  906,  949])

The False Positives are the sum of the respective column, minus the diagonal element (i.e. the TP element):

FP = np.sum(cm, axis=0) - TP
FP
# array([50, 28, 39, 56, 37, 11, 66, 42, 54, 49])

Similarly, the False Negatives are the sum of the respective row, minus the diagonal (i.e. TP) element:

FN = np.sum(cm, axis=1) - TP
FN
# array([17, 16, 60, 35, 29, 74, 20, 53, 68, 60])

Now, the True Negatives are a little trickier. Let's first think about what exactly a True Negative means with respect to, say, class 0: it means all the samples that have been correctly identified as not being 0. So, essentially, what we should do is remove the corresponding row & column from the confusion matrix, and then sum up all the remaining elements:

num_classes = 10
TN = []
for i in range(num_classes):
    temp = np.delete(cm, i, 0)    # delete ith row
    temp = np.delete(temp, i, 1)  # delete ith column
    TN.append(temp.sum())         # sum of everything that is left
TN = np.array(TN)                 # array, so it combines cleanly with TP, FP, FN
TN
# array([8970, 8837, 8929, 8934, 8981, 9097, 8976, 8930, 8972, 8942])
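The same counts can also be obtained without a loop: for a given class, every sample is exactly one of TP, FP, FN, or TN, so TN is the grand total minus the other three. A minimal sketch, using a small made-up 3-class matrix (rows = true labels, columns = predicted labels) instead of the MNIST one, and cross-checking against the row/column deletion approach:

```python
import numpy as np

# Hypothetical 3-class confusion matrix (rows: true, columns: predicted)
cm = np.array([[5, 1, 0],
               [2, 6, 1],
               [0, 2, 8]])

TP = np.diag(cm)
FP = cm.sum(axis=0) - TP
FN = cm.sum(axis=1) - TP

# Everything that is not TP, FP, or FN for a class is a TN for that class
TN = cm.sum() - (TP + FP + FN)

# Cross-check: delete the class's row and column, then sum what is left
TN_loop = np.array([np.delete(np.delete(cm, i, 0), i, 1).sum()
                    for i in range(cm.shape[0])])

print(TN)                           # [17 13 14]
print(np.array_equal(TN, TN_loop))  # True
```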

Let's make a sanity check: for each class, the sum of TP, FP, FN, and TN must equal the size of our test set (here 10,000). Let's confirm that this is indeed the case:

l = 10000
for i in range(num_classes):
    print(TP[i] + FP[i] + FN[i] + TN[i] == l)

The result is

True
True
True
True
True
True
True
True
True
True

Having calculated these quantities, it is now straightforward to get the precision & recall per class:

precision = TP/(TP+FP)
recall = TP/(TP+FN)

which for this example are

precision
# array([ 0.95064166,  0.97558849,  0.96142433,  0.9456838 ,  0.96262626,
#         0.986731  ,  0.93426295,  0.95870206,  0.94375   ,  0.9509018])

recall
# array([ 0.98265306,  0.98590308,  0.94186047,  0.96534653,  0.97046843,
#         0.91704036,  0.97912317,  0.94844358,  0.9301848 ,  0.94053518])
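The question also asks for the F-measure, which the steps so far do not compute. F1 is the harmonic mean of precision and recall, which simplifies to 2TP / (2TP + FP + FN). A minimal sketch on a small made-up 3-class matrix (rows = true labels, columns = predicted labels), showing that both forms agree:

```python
import numpy as np

# Hypothetical 3-class confusion matrix (rows: true, columns: predicted)
cm = np.array([[5, 1, 0],
               [2, 6, 1],
               [0, 2, 8]])

TP = np.diag(cm)
FP = cm.sum(axis=0) - TP
FN = cm.sum(axis=1) - TP

precision = TP / (TP + FP)
recall = TP / (TP + FN)

# Harmonic mean of precision and recall, per class
f1 = 2 * precision * recall / (precision + recall)

# Equivalent form written directly in terms of the counts
f1_from_counts = 2 * TP / (2 * TP + FP + FN)

print(np.allclose(f1, f1_from_counts))  # True
```

Note that a class with no true samples and no predicted samples makes these divisions 0/0, so real data may need that case handled explicitly.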

Similarly, we can compute related quantities, like specificity (recall that sensitivity is the same thing as recall):

specificity = TN/(TN+FP)

Results for our example:

specificity
# array([0.99445676, 0.99684151, 0.9956512 , 0.99377086, 0.99589709,
#        0.99879227, 0.99270073, 0.99531877, 0.99401728, 0.99455011])

You should now be able to compute these quantities for a confusion matrix of virtually any size.
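To wrap these steps into something reusable and reduce the per-class values to single numbers, one option is the sketch below. The names `safe_divide` and `scores_from_cm` and the dictionary keys are made up for illustration. The matrix is assumed to have true labels in rows and predicted labels in columns, and any score whose denominator is 0 is set to 0, which is a choice rather than a rule.

```python
import numpy as np

def safe_divide(num, den):
    """Elementwise num / den, returning 0 where den is 0."""
    num = np.asarray(num, dtype=float)
    den = np.asarray(den, dtype=float)
    out = np.zeros_like(num)
    np.divide(num, den, out=out, where=den != 0)
    return out

def scores_from_cm(cm):
    """Per-class and averaged scores (rows: true, columns: predicted)."""
    cm = np.asarray(cm)
    TP = np.diag(cm)
    FP = cm.sum(axis=0) - TP
    FN = cm.sum(axis=1) - TP
    support = cm.sum(axis=1)  # number of true samples per class

    f1 = safe_divide(2 * TP, 2 * TP + FP + FN)
    return {
        "precision": safe_divide(TP, TP + FP),
        "recall": safe_divide(TP, TP + FN),
        "f1": f1,
        "macro_f1": f1.mean(),                           # unweighted mean over classes
        "weighted_f1": np.average(f1, weights=support),  # mean weighted by class size
        "micro_f1": TP.sum() / cm.sum(),                 # equals accuracy for single-label data
    }

# Hypothetical 3-class confusion matrix
scores = scores_from_cm([[5, 1, 0],
                         [2, 6, 1],
                         [0, 2, 8]])
print(scores["micro_f1"])  # 0.76
```

If your matrix has predicted labels in rows instead, pass its transpose (`cm.T`).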

Solution 2

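If you have a binary confusion matrix whose rows are the predicted classes and whose columns are the actual classes, with the positive class first (note that this is the transpose of the scikit-learn layout), in the form of: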

cmat = [[ 5,  7], 
        [25, 37]]

then the following simple function can be used:

def myscores(smat): 
    tp = smat[0][0] 
    fp = smat[0][1] 
    fn = smat[1][0] 
    tn = smat[1][1] 
    return tp/(tp+fp), tp/(tp+fn)

Testing:

print("precision and recall:", myscores(cmat))

Output:

precision and recall: (0.4166666666666667, 0.16666666666666666)

The above function can also be extended to produce other scores, the formulae for which are given at https://en.wikipedia.org/wiki/Confusion_matrix
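As one possible extension, the sketch below adds the F1 score, accuracy, and specificity. It keeps the same cell layout as `myscores` (tp and fp in the first row, fn and tn in the second). The function name `binary_scores` and the dictionary keys are made up for illustration.

```python
def binary_scores(smat):
    # Same cell layout as myscores: [[tp, fp], [fn, tn]]
    (tp, fp), (fn, tn) = smat
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "specificity": tn / (tn + fp),
    }

cmat = [[5, 7],
        [25, 37]]
scores = binary_scores(cmat)
print(scores["precision"], scores["recall"])
```

A matrix with an empty row or column would raise ZeroDivisionError here, so guard the divisions if that can happen in your data.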

Solution 3

There is a package called 'disarray'.

So, if I have four classes:

import numpy as np
a = np.random.randint(0,4,[100])
b = np.random.randint(0,4,[100])

I can use disarray to calculate 13 metrics:

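import disarray  # importing it registers the .da accessor on pandas DataFrames
import pandas as pd
from sklearn.metrics import confusion_matrix

# Instantiate the confusion matrix DataFrame with index and columns
cm = confusion_matrix(a,b)
df = pd.DataFrame(cm, index= ['a','b','c','d'], columns=['a','b','c','d'])
df.da.export_metrics()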

which gives :


Author by

ryo

Updated on June 12, 2022

Comments

ryo almost 2 years

I'm using Python and have some confusion matrices. I'd like to calculate precision, recall, and f-measure from the confusion matrices in multiclass classification. My result logs don't contain y_true and y_pred, only the confusion matrix.

Could you tell me how to get these scores from a confusion matrix in multiclass classification?