Learning Weka - Precision and Recall - Wiki example to .Arff file

Let's start with the basic definitions of precision and recall.

Precision = TP/(TP+FP)
Recall = TP/(TP+FN)

Where TP is True Positive, FP is False Positive, and FN is False Negative.
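As a minimal sketch, the two definitions can be written as small Python functions. The function names are hypothetical, and returning 0.0 when the denominator is zero is an assumption chosen to match how the Weka output in the question prints such cases:

```python
def precision(tp: int, fp: int) -> float:
    """TP / (TP + FP). Returns 0.0 if nothing was predicted positive (assumed convention)."""
    denom = tp + fp
    return tp / denom if denom else 0.0


def recall(tp: int, fn: int) -> float:
    """TP / (TP + FN). Returns 0.0 if there are no actual positives (assumed convention)."""
    denom = tp + fn
    return tp / denom if denom else 0.0


# Example: 4 true positives, 3 false positives, 0 false negatives
print(round(precision(4, 3), 3))
print(recall(4, 0))
```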

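In the dog.arff file from the question, Weka took into account only the first 7 tuples and ignored the remaining 7, because their class value is missing ("?"); the output reports them as "Ignored Class Unknown Instances". The output also shows that ZeroR classified all 7 remaining tuples as correct: 4 of them really are correct, and 3 are actually wrong.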
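To see where the 7/7 split comes from, here is a rough Python sketch that counts the data rows of the question's ARFF file by class value. It only splits the @data lines on commas and is not a real ARFF parser; the variable names are made up for illustration:

```python
from collections import Counter

# The @data section of the ARFF file from the question
data_rows = """dog,dog,correct
dog,dog,correct
dog,dog,correct
dog,dog,correct
cat,dog,wrong
cat,dog,wrong
cat,dog,wrong
dog,?,?
dog,?,?
dog,?,?
dog,?,?
dog,?,?
cat,?,?
cat,?,?""".splitlines()

# The last field of each row is the class attribute; "?" means missing
class_values = [row.split(",")[-1] for row in data_rows]
known = Counter(value for value in class_values if value != "?")
ignored = class_values.count("?")

print(known)    # class counts among rows with a known class
print(ignored)  # rows with a missing class value
```

Only the rows with a known class (4 correct, 3 wrong) can be evaluated, which matches "Total Number of Instances 7" in the output.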

Let's calculate precision and recall for the correct and wrong classes. First, for the correct class:

Precision = 4/(4+3) = 0.571428571
Recall = 4/(4+0) = 1

For the wrong class:

Precision = 0/(0+0), which is undefined; Weka reports it as 0
Recall = 0/(0+3) = 0
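The same numbers can be derived directly from the confusion matrix in the Weka output (rows are the actual class, columns the predicted class). This is a small Python sketch with a hypothetical helper name; reporting 0.0 for a zero denominator is an assumption that mirrors the 0 shown in the Weka output:

```python
# Confusion matrix from the Weka output: rows = actual, columns = predicted
labels = ["correct", "wrong"]
matrix = [
    [4, 0],  # actual correct
    [3, 0],  # actual wrong
]


def per_class_metrics(matrix, k):
    """Return (precision, recall) for the class at index k."""
    tp = matrix[k][k]
    fp = sum(row[k] for row in matrix) - tp  # predicted k, but actually another class
    fn = sum(matrix[k]) - tp                 # actually k, but predicted another class
    prec = tp / (tp + fp) if (tp + fp) else 0.0
    rec = tp / (tp + fn) if (tp + fn) else 0.0
    return prec, rec


for k, label in enumerate(labels):
    prec, rec = per_class_metrics(matrix, k)
    print(f"{label}: precision={prec:.3f} recall={rec:.3f}")
```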
Author by

Langeleppel

Updated on June 04, 2022

Comments

  • Langeleppel
    Langeleppel almost 2 years

    I'm new to WEKA and advanced statistics, and I'm starting from scratch to understand the WEKA measures. I've worked through all of the @rushdi-shams examples, which are great resources.

    The Wikipedia article http://en.wikipedia.org/wiki/Precision_and_recall explains the measures with a simple example: video recognition software detects 7 dogs in a scene containing 9 real dogs and some cats. I understand the example and the recall calculation perfectly. So as a first step, I want to reproduce it in Weka with this data. How do I create such an .ARFF file? With the file below I get a wrong confusion matrix, and the Recall under "Detailed Accuracy By Class" is wrong: it should not be 1, it should be 4/9 (0.4444).

    @relation 'dogs and cat detection'
    
    @attribute              'realanimal'      {dog,cat}
    @attribute              'detected'        {dog,cat}
    @attribute              'class'           {correct,wrong}
    
    @data
    dog,dog,correct
    dog,dog,correct
    dog,dog,correct
    dog,dog,correct
    cat,dog,wrong
    cat,dog,wrong
    cat,dog,wrong
    dog,?,?
    dog,?,?
    dog,?,?
    dog,?,?
    dog,?,?
    cat,?,?
    cat,?,?
    

    Weka output (without filters):

    === Run information ===

    Scheme:weka.classifiers.rules.ZeroR 
    Relation:     dogs and cat detection
    Instances:    14
    Attributes:   3
              realanimal
              detected
              class
    Test mode:10-fold cross-validation
    
    === Classifier model (full training set) ===
    
    ZeroR predicts class value: correct
    
    Time taken to build model: 0 seconds
    
    === Stratified cross-validation ===
    === Summary ===
    
    Correctly Classified Instances           4               57.1429 %
    Incorrectly Classified Instances         3               42.8571 %
    Kappa statistic                          0     
    Mean absolute error                      0.5   
    Root mean squared error                  0.5044
    Relative absolute error                100      %
    Root relative squared error            100      %
    Total Number of Instances                7     
    Ignored Class Unknown Instances          7     
    
    === Detailed Accuracy By Class ===
    
               TP Rate   FP Rate   Precision   Recall  F-Measure   ROC Area  Class
                 1         1          0.571     1         0.727      0.65     correct
                 0         0          0         0         0          0.136    wrong
    Weighted Avg.    0.571     0.571      0.327     0.571     0.416      0.43 
    
    === Confusion Matrix ===
    
     a b   <-- classified as
     4 0 | a = correct
     3 0 | b = wrong
    

    There must be something wrong with the false negative dogs. Or is my ARFF approach totally wrong, and do I need a different kind of attribute?

    Thanks