Error in confusionMatrix: the data and reference factors must have the same number of levels

Solution 1

Try passing a table of the predictions and the reference values:

confusionMatrix(table(predictionsTree, testdata$category))

That worked for me.
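
A minimal sketch of the table approach, using made-up toy factors rather than the data from the question (the names `classes`, `pred`, `ref` and `tab` are only for illustration). It assumes both factors are defined on the same set of levels, so the resulting table is square; the caret call is only run if the package is installed.

```r
# Toy data standing in for the predictions and the reference (hypothetical values)
classes <- c("a", "b", "c")
pred <- factor(c("a", "b", "b", "a"), levels = classes)
ref  <- factor(c("a", "b", "c", "a"), levels = classes)

# Cross-tabulate; a level that never occurs still gets a row or column of zeros
tab <- table(Prediction = pred, Reference = ref)
print(tab)

# Only call caret if it is available
if (requireNamespace("caret", quietly = TRUE)) {
  print(caret::confusionMatrix(tab))
}
```

Here class "c" is never predicted, yet it still appears as a row in the table because the levels were set explicitly.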

Solution 2

Maybe your model is not predicting a certain factor level. Use the table() function instead of confusionMatrix() to see if that is the problem.
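
The checks can be sketched roughly as follows, on a small hypothetical stand-in for `testdata` and `predictionsTree` (the values are invented). Note that `$` returns NULL for a column name that does not exist, so a misspelling such as `catgeory` for `category` produces an object with no levels at all, which is worth ruling out first.

```r
# Hypothetical stand-ins for testdata and predictionsTree
testdata <- data.frame(category = factor(c("x", "y", "z", "x")))
predictionsTree <- factor(c("x", "y", "y", "x"), levels = c("x", "y", "z"))

# 1. A misspelled column name silently returns NULL, which has no levels
is.null(testdata$catgeory)   # TRUE: the column is called "category"

# 2. Compare the level sets in both directions (both empty if they match)
missing_in_pred <- setdiff(levels(testdata$category), levels(predictionsTree))
missing_in_ref  <- setdiff(levels(predictionsTree), levels(testdata$category))

# 3. Count how often each class is predicted; zeros are classes never predicted
pred_counts <- table(predictionsTree)
never_predicted <- names(pred_counts)[pred_counts == 0]
print(never_predicted)
```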

Solution 3

Try specifying na.pass for the na.action option:

predictionsTree <- predict(treeFit, testdata, na.action = na.pass)
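
To see whether missing values are involved at all, it can help to count the incomplete rows in the test set. The sketch below uses an invented data frame (the `amount` column is hypothetical) and only base R. As far as I know, predict() on a caret model uses na.omit by default, which would drop such rows and leave fewer predictions than reference values.

```r
# Hypothetical test set with one missing predictor value
testdata <- data.frame(
  amount   = c(10, NA, 30),
  category = factor(c("x", "y", "x"))
)

# Rows that na.omit would drop
incomplete <- !complete.cases(testdata)
n_dropped <- sum(incomplete)
print(n_dropped)

# na.omit removes incomplete rows, na.pass keeps every row
print(nrow(na.omit(testdata)))
print(nrow(na.pass(testdata)))
```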

Solution 4

Combine them into a data frame and then pass them to the confusionMatrix function:

predicted <- factor(predict(treeFit, testdata))
real <- factor(testdata$category)

my_data1 <- data.frame(data = predicted, type = "prediction")
my_data2 <- data.frame(data = real, type = "real")
# rbind() merges the levels of both factors into one shared set
my_data3 <- rbind(my_data1, my_data2)

# Check if the levels are identical
identical(levels(my_data3[my_data3$type == "prediction", 1]), levels(my_data3[my_data3$type == "real", 1]))

confusionMatrix(my_data3[my_data3$type == "prediction", 1], my_data3[my_data3$type == "real", 1], dnn = c("Prediction", "Reference"))
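
A shorter variant of the same idea, as a sketch only: instead of going through a data frame, both factors can be re-created on the union of their levels. The toy vectors and the name `all_levels` are made up for illustration, and the caret call is only run if the package is installed.

```r
# Hypothetical predictions and reference with different level sets
predicted <- factor(c("x", "y", "y"))   # levels: x, y
real      <- factor(c("x", "y", "z"))   # levels: x, y, z

# Re-create both factors on the union of the levels
all_levels <- union(levels(predicted), levels(real))
predicted  <- factor(predicted, levels = all_levels)
real       <- factor(real, levels = all_levels)

# Check if the levels are identical
print(identical(levels(predicted), levels(real)))

if (requireNamespace("caret", quietly = TRUE)) {
  print(caret::confusionMatrix(predicted, real, dnn = c("Prediction", "Reference")))
}
```

The values themselves are unchanged; only the set of allowed levels is widened so that both factors agree.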
Author by

user2987739

Updated on July 13, 2022

Comments

  • user2987739
    user2987739 almost 2 years

    I've trained a tree model with R caret. I'm now trying to generate a confusion matrix and keep getting the following error:

    Error in confusionMatrix.default(predictionsTree, testdata$catgeory) : the data and reference factors must have the same number of levels

    prob <- 0.5 #Specify class split
    singleSplit <- createDataPartition(modellingData2$category, p=prob,
                                       times=1, list=FALSE)
    cvControl <- trainControl(method="repeatedcv", number=10, repeats=5)
    traindata <- modellingData2[singleSplit,]
    testdata <- modellingData2[-singleSplit,]
    treeFit <- train(traindata$category~., data=traindata,
                     trControl=cvControl, method="rpart", tuneLength=10)
    predictionsTree <- predict(treeFit, testdata)
    confusionMatrix(predictionsTree, testdata$catgeory)
    

    The error occurs when generating the confusion matrix. The levels are the same on both objects. I can't figure out what the problem is. Their structure and levels are given below. They should be the same. Any help would be greatly appreciated, as it's making me cracked!

    > str(predictionsTree)
     Factor w/ 30 levels "16-Merchant Service Charge",..: 28 22 22 22 22 6 6 6 6 6 ...
    > str(testdata$category)
     Factor w/ 30 levels "16-Merchant Service Charge",..: 30 30 7 7 7 7 7 30 7 7 ...
    
    > levels(predictionsTree)
     [1] "16-Merchant Service Charge"   "17-Unpaid Cheque Fee"         "18-Gov. Stamp Duty"           "Misc"                         "26-Standard Transfer Charge" 
     [6] "29-Bank Giro Credit"          "3-Cheques Debit"              "32-Standing Order - Debit"    "33-Inter Branch Payment"      "34-International"            
    [11] "35-Point of Sale"             "39-Direct Debits Received"    "4-Notified Bank Fees"         "40-Cash Lodged"               "42-International Receipts"   
    [16] "46-Direct Debits Paid"        "56-Credit Card Receipts"      "57-Inter Branch"              "58-Unpaid Items"              "59-Inter Company Transfers"  
    [21] "6-Notified Interest Credited" "61-Domestic"                  "64-Charge Refund"             "66-Inter Company Transfers"   "67-Suppliers"                
    [26] "68-Payroll"                   "69-Domestic"                  "73-Credit Card Payments"      "82-CHAPS Fee"                 "Uncategorised"   
    
    > levels(testdata$category)
     [1] "16-Merchant Service Charge"   "17-Unpaid Cheque Fee"         "18-Gov. Stamp Duty"           "Misc"                         "26-Standard Transfer Charge" 
     [6] "29-Bank Giro Credit"          "3-Cheques Debit"              "32-Standing Order - Debit"    "33-Inter Branch Payment"      "34-International"            
    [11] "35-Point of Sale"             "39-Direct Debits Received"    "4-Notified Bank Fees"         "40-Cash Lodged"               "42-International Receipts"   
    [16] "46-Direct Debits Paid"        "56-Credit Card Receipts"      "57-Inter Branch"              "58-Unpaid Items"              "59-Inter Company Transfers"  
    [21] "6-Notified Interest Credited" "61-Domestic"                  "64-Charge Refund"             "66-Inter Company Transfers"   "67-Suppliers"                
    [26] "68-Payroll"                   "69-Domestic"                  "73-Credit Card Payments"      "82-CHAPS Fee"                 "Uncategorised"