Learning Weka - Precision and Recall - Wiki example to .Arff file
Let's start with the basic definitions of precision and recall.
Precision = TP/(TP+FP)
Recall = TP/(TP+FN)
where TP is the number of true positives, FP is the number of false positives, and FN is the number of false negatives.
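The two definitions can be written as a pair of small functions. This is only a minimal sketch: the function names and the counts in the usage lines are illustrative assumptions, not values taken from the ARFF file.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of the predicted positives that are truly positive."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Fraction of the actual positives that were found."""
    return tp / (tp + fn)


# Illustrative counts (assumed): 8 hits, 2 false alarms, 4 misses.
print(precision(tp=8, fp=2))  # 0.8
print(recall(tp=8, fn=4))     # two thirds
```

Both functions raise ZeroDivisionError when the denominator is zero, which is exactly the case where the measure is undefined.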
In the dog.arff file from the question, Weka used only the first 7 tuples and ignored the remaining 7, because their class value is missing (?). The output shows that ZeroR assigned all 7 tuples to the class correct, which is right for the 4 correct tuples and wrong for the 3 wrong tuples.
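The 7/7 split can be checked by hand. The sketch below does not use Weka; it simply copies the data rows from the question and separates those with a known class from those whose class is the ARFF missing-value marker "?". The variable names are my own.

```python
# Data rows copied from the dog.arff file in the question.
rows = (
    ["dog,dog,correct"] * 4
    + ["cat,dog,wrong"] * 3
    + ["dog,?,?"] * 5
    + ["cat,?,?"] * 2
)

# The class attribute is the last column; "?" marks a missing value in ARFF.
labelled = [r for r in rows if r.split(",")[-1] != "?"]
ignored = [r for r in rows if r.split(",")[-1] == "?"]

print(len(labelled), len(ignored))  # 7 7
```

These counts match the "Total Number of Instances 7" and "Ignored Class Unknown Instances 7" lines in the Weka summary.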
Let's calculate precision and recall for the correct and wrong classes. First, for the correct class:
Precision = 4/(4+3) = 0.571428571
Recall = 4/(4+0) = 1
For the wrong class:
Precision = 0/(0+0), which is undefined; Weka reports it as 0
Recall = 0/(0+3) = 0
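The same per-class figures can be derived mechanically from the confusion matrix in the Weka output. The following is a sketch, not Weka's own code: the helper names are hypothetical, and returning 0 when a denominator is 0 is an assumption chosen to mirror the 0 that Weka prints for the wrong class.

```python
classes = ["correct", "wrong"]
# Rows are actual classes, columns are predicted classes (Weka's layout).
confusion = [
    [4, 0],  # actual = correct
    [3, 0],  # actual = wrong
]


def safe_div(num, den):
    # Assumed convention: report 0 when the denominator is 0.
    return num / den if den else 0.0


def per_class_metrics(matrix, k):
    tp = matrix[k][k]
    fp = sum(row[k] for row in matrix) - tp  # predicted k, actually another class
    fn = sum(matrix[k]) - tp                 # actually k, predicted another class
    return safe_div(tp, tp + fp), safe_div(tp, tp + fn)


results = {}
for k, name in enumerate(classes):
    results[name] = per_class_metrics(confusion, k)
    p, r = results[name]
    print(f"{name}: precision={p:.3f} recall={r:.3f}")
```

For the matrix above this prints a precision of 0.571 and a recall of 1.000 for correct, and 0.000 for both measures on wrong.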
Langeleppel
Updated on June 04, 2022
I'm new to WEKA and advanced statistics, and I'm starting from scratch to understand the WEKA measures. I've worked through all the @rushdi-shams examples, which are great resources.
The Wikipedia article http://en.wikipedia.org/wiki/Precision_and_recall explains the measures with a simple example: video recognition software detects 7 dogs in a scene containing 9 real dogs and some cats. I understand the example and the recall calculation perfectly. So, as a first step, I want to reproduce it in Weka with this data. How do I create such an .ARFF file? With the following file I get the wrong confusion matrix and the wrong Detailed Accuracy By Class: recall is reported as 1, but it should be 4/9 (0.4444).
@relation 'dogs and cat detection'

@attribute 'realanimal' {dog,cat}
@attribute 'detected' {dog,cat}
@attribute 'class' {correct,wrong}

@data
dog,dog,correct
dog,dog,correct
dog,dog,correct
dog,dog,correct
cat,dog,wrong
cat,dog,wrong
cat,dog,wrong
dog,?,?
dog,?,?
dog,?,?
dog,?,?
dog,?,?
cat,?,?
cat,?,?
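As a cross-check of the expected 4/9, independent of Weka, the counts can be taken directly from the realanimal and detected columns. This sketch rests on one loud assumption: a "?" in the detected column is read as "the software did not flag this animal as a dog". Under that reading, realanimal is the ground truth and detected is the prediction for the dog class.

```python
# (realanimal, detected) pairs from the @data section; the class column is dropped.
pairs = (
    [("dog", "dog")] * 4
    + [("cat", "dog")] * 3
    + [("dog", "?")] * 5
    + [("cat", "?")] * 2
)

# Assumption: "?" means the animal was not detected as a dog.
tp = sum(1 for actual, det in pairs if actual == "dog" and det == "dog")
fp = sum(1 for actual, det in pairs if actual != "dog" and det == "dog")
fn = sum(1 for actual, det in pairs if actual == "dog" and det != "dog")

precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(tp, fp, fn)                             # 4 3 5
print(round(precision, 4), round(recall, 4))  # 0.5714 0.4444
```

The five undetected dogs are the false negatives here. In the ARFF file they have a missing class value, so they never enter a confusion matrix built on the class attribute.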
Weka output (without filters)
=== Run information ===
Scheme:weka.classifiers.rules.ZeroR
Relation:     dogs and cat detection
Instances:    14
Attributes:   3
              realanimal
              detected
              class
Test mode:10-fold cross-validation

=== Classifier model (full training set) ===

ZeroR predicts class value: correct

Time taken to build model: 0 seconds

=== Stratified cross-validation ===
=== Summary ===

Correctly Classified Instances           4               57.1429 %
Incorrectly Classified Instances         3               42.8571 %
Kappa statistic                          0
Mean absolute error                      0.5
Root mean squared error                  0.5044
Relative absolute error                100      %
Root relative squared error            100      %
Total Number of Instances                7
Ignored Class Unknown Instances          7

=== Detailed Accuracy By Class ===

               TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area   Class
               1         1         0.571       1        0.727       0.65       correct
               0         0         0           0        0           0.136      wrong
Weighted Avg.  0.571     0.571     0.327       0.571    0.416       0.43

=== Confusion Matrix ===

 a b   <-- classified as
 4 0 | a = correct
 3 0 | b = wrong
There must be something wrong with how I represent the false-negative dogs. Or is my ARFF approach totally wrong, and do I need different attributes?
Thanks