Can I get a list of wrong predictions in scikit-learn?

python machine-learning scikit-learn svm

12,747

Solution 1

The simplest approach is just to iterate over your predictions (and correct classifications) and do whatever you want with the output (in the following example I will just print it to stdout).

Lets assume that your data is in inputs, labels, and your trained SVM is in clf, then you can just do

predictions = clf.predict(inputs)
for input, prediction, label in zip(inputs, predictions, labels):
  if prediction != label:
    print(input, 'has been classified as ', prediction, 'and should be ', label)

Solution 2

It depends on what form you want the incorrect predictions to be in. For most use cases a confusion matrix should be sufficient.

A confusion matrix is a plot of the actual class vs the predicted class, such that the diagonal of the graph is all of the correct predictions, and the remaining cells are the incorrect predictions.

You can see a better example of a confusion matrix on sklearn's Confusion Matrix example.

If you just want a list of all of the misclassified values with their predicted and actual classes, you can do something like the following.

Just select all of the rows of data where the actual and predicted classes are not equal.

import numpy as np
import pandas as pd

X = np.array([0.1, 0.34, 0.2, 0.98])
y = np.array(["A", "B", "A", "C"])

y_pred = np.array(["A", "C", "B", "C"])

df = pd.DataFrame(X, columns=["X"])
df["actual"] = y
df["predicted"] = y_pred

incorrect = df[df["actual"] != df["predicted"]]

In this case incorrect would contain the following entries.

      X actual predicted
1  0.34      B         C
2  0.20      A         B

12,747

Author by

Kumar Ashutosh

Updated on June 05, 2022

Comments

Kumar Ashutosh almost 2 years

We can use svm.SVC.score() to evaluate the accuracy of the SVM model. I want to get the predicted class and the actual class in case of wrong predictions. How can I achieve this in scikit-learn?