How to solve "The data cannot have more levels than the reference" error when using confusionMatrix?
Solution 1
I had the same issue in classification. It turned out that one specific group had zero observations, which is why I got the error "the data cannot have more levels than the reference".
Make sure that all groups in your test set also appear in your training set.
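One way to check this is to compare the class labels on both sides of the split. The snippet below is a minimal base-R sketch; `train_y` and `test_y` are made-up toy vectors standing in for your own response columns.

```r
# Hypothetical toy labels; replace with your own train/test response columns.
train_y <- factor(c("a", "b", "a", "b"))
test_y  <- factor(c("a", "b", "c"))

# Classes that occur in the test set but never in the training set
missing_in_train <- setdiff(unique(as.character(test_y)),
                            unique(as.character(train_y)))
print(missing_in_train)

# Per-class counts in the training set over all known classes;
# a count of zero flags an empty group
all_levels   <- union(levels(train_y), levels(test_y))
train_counts <- table(factor(train_y, levels = all_levels))
print(train_counts)
```

If `missing_in_train` is not empty, the split needs to be redone (for example with a stratified partition) or the rare classes need to be handled before modelling.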
Solution 2
If you look carefully at your plots, you will see that you are training a regression tree and not a classification tree.
If you run credit$Creditability <- as.factor(credit$Creditability)
after reading in the data and use type = "class"
in the predict function, your code should work.
code:
credit <- read.csv("http://freakonometrics.free.fr/german_credit.csv" )
credit$Creditability <- as.factor(credit$Creditability)
library(caret)
library(tree)
library(e1071)
set.seed(1000)
intrain <- createDataPartition(y = credit$Creditability, p = 0.7, list = FALSE)
train <- credit[intrain, ]
test <- credit[-intrain, ]
treemod <- tree(Creditability ~ ., data = train)
cv.trees <- cv.tree(treemod, FUN = prune.tree)
plot(cv.trees)
prune.trees <- prune.tree(treemod, best = 3)
plot(prune.trees)
text(prune.trees, pretty = 0)
treepred <- predict(prune.trees, newdata = test, type = "class")
confusionMatrix(treepred, test$Creditability)
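To see why the regression tree triggers the error, it may help to count levels directly. The sketch below uses made-up values rather than the credit data, and it assumes that numeric predictions end up being treated as a factor, which is what the error message suggests.

```r
# Made-up stand-ins: a two-class reference and numeric (regression-style) predictions.
reference    <- factor(c(0, 1, 1, 0, 1))
numeric_pred <- c(0.21, 0.83, 0.67, 0.35, 0.83)

# Treated as a factor, every distinct numeric value becomes its own level
pred_levels <- nlevels(factor(numeric_pred))
ref_levels  <- nlevels(reference)

print(pred_levels)  # number of distinct predicted values
print(ref_levels)   # number of classes in the reference

# TRUE here is the situation the error message complains about
print(pred_levels > ref_levels)
```

With type = "class" on a classification tree, the predictions are already a factor with the same two levels as the reference, so this mismatch does not arise.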
Admin
Updated on June 09, 2022
Comments
Admin almost 2 years
I'm using R programming. I divided the data as train & test for predicting accuracy.
This is my code:
library("tree")
credit<-read.csv("C:/Users/Administrator/Desktop/german_credit (2).csv")
library("caret")
set.seed(1000)
intrain<-createDataPartition(y=credit$Creditability,p=0.7,list=FALSE)
train<-credit[intrain, ]
test<-credit[-intrain, ]
treemod<-tree(Creditability~. , data=train)
plot(treemod)
text(treemod)
cv.trees<-cv.tree(treemod,FUN=prune.tree)
plot(cv.trees)
prune.trees<-prune.tree(treemod,best=3)
plot(prune.trees)
text(prune.trees,pretty=0)
install.packages("e1071")
library("e1071")
treepred<-predict(prune.trees, newdata=test)
confusionMatrix(treepred, test$Creditability)
The following error message occurs in confusionMatrix:
Error in confusionMatrix.default(rpartpred, test$Creditability) : the data cannot have more levels than the reference
The credit data can be downloaded from this site:
http://freakonometrics.free.fr/german_credit.csv
StatMan over 7 years
More or less, the code then predicts the probability that each entry in test belongs to class '0' or '1', so the OP has to convert these predicted probabilities to predicted classifications.
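The conversion described in this comment can be sketched in base R as follows. The probabilities, the reference labels and the 0.5 cut-off are all illustrative assumptions, not values from the credit data.

```r
# Made-up predicted probabilities of class '1' and the matching true labels.
prob      <- c(0.21, 0.83, 0.67, 0.35, 0.50)
reference <- factor(c(0, 1, 1, 0, 1), levels = c(0, 1))

# Assumed cut-off; choose one that suits your problem
threshold <- 0.5

# Turn probabilities into class labels that share the reference's levels
pred_class <- factor(ifelse(prob > threshold, 1, 0),
                     levels = levels(reference))

# Cross-tabulate predictions against the truth
conf_table <- table(Predicted = pred_class, Actual = reference)
print(conf_table)

# Share of correct predictions
accuracy <- mean(pred_class == reference)
print(accuracy)
```

Because `pred_class` is built with the same levels as `reference`, the two factors can be compared directly, and they could also be passed to confusionMatrix without a level mismatch.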