Can I get a list of wrong predictions in scikit-learn?

12,747

Solution 1

The simplest approach is just to iterate over your predictions (and correct classifications) and do whatever you want with the output (in the following example I will just print it to stdout).

Lets assume that your data is in inputs, labels, and your trained SVM is in clf, then you can just do

predictions = clf.predict(inputs)
for input, prediction, label in zip(inputs, predictions, labels):
  if prediction != label:
    print(input, 'has been classified as ', prediction, 'and should be ', label) 

Solution 2

It depends on what form you want the incorrect predictions to be in. For most use cases a confusion matrix should be sufficient.

A confusion matrix is a plot of the actual class vs the predicted class, such that the diagonal of the graph is all of the correct predictions, and the remaining cells are the incorrect predictions.

Confusion Matrix

You can see a better example of a confusion matrix on sklearn's Confusion Matrix example.

If you just want a list of all of the misclassified values with their predicted and actual classes, you can do something like the following.

Just select all of the rows of data where the actual and predicted classes are not equal.

import numpy as np
import pandas as pd

X = np.array([0.1, 0.34, 0.2, 0.98])
y = np.array(["A", "B", "A", "C"])

y_pred = np.array(["A", "C", "B", "C"])

df = pd.DataFrame(X, columns=["X"])
df["actual"] = y
df["predicted"] = y_pred

incorrect = df[df["actual"] != df["predicted"]]

In this case incorrect would contain the following entries.

      X actual predicted
1  0.34      B         C
2  0.20      A         B
Share:
12,747
Kumar Ashutosh
Author by

Kumar Ashutosh

Updated on June 05, 2022

Comments

  • Kumar Ashutosh
    Kumar Ashutosh almost 2 years

    We can use svm.SVC.score() to evaluate the accuracy of the SVM model. I want to get the predicted class and the actual class in case of wrong predictions. How can I achieve this in scikit-learn?