Calculating prediction accuracy of a tree using rpart's predict method

Try calculating the confusion matrix first:

confMat <- table(test$class,t_pred)

Now you can calculate the accuracy by dividing the sum of the matrix diagonal, which holds the correct predictions, by the total sum of the matrix:

accuracy <- sum(diag(confMat))/sum(confMat)
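
To see these two lines run end to end, here is a minimal sketch that uses the built-in iris data as a stand-in for the asker's file, so Species takes the place of class; the seed value is an arbitrary assumption. It also suggests why the comparison in the question fails: test['class'] is a one-column data frame, whereas test$class is a factor that can be compared with the predictions element by element. Note too that length() of a data frame counts columns, not rows.

```r
library(rpart)

set.seed(42)  # arbitrary seed, only to make the random split repeatable

# iris stands in for the asker's data; Species plays the role of `class`
trainIndex <- sample(seq_len(nrow(iris)), 0.8 * nrow(iris))
train <- iris[trainIndex, ]
test  <- iris[-trainIndex, ]

tree <- rpart(Species ~ ., data = train, method = "class",
              parms = list(split = "information"))
t_pred <- predict(tree, test, type = "class")

# rows are the actual classes, columns are the predicted classes
confMat <- table(test$Species, t_pred)
accuracy <- sum(diag(confMat)) / sum(confMat)

# equivalent shortcut: the share of element-wise matches
accuracy2 <- mean(t_pred == test$Species)
print(accuracy)
```

Both accuracy and accuracy2 should give the same number; the confusion matrix is still worth printing, since it shows which classes get mixed up.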
Author by

Arat254

Updated on March 01, 2021

Comments

  • Arat254
    Arat254 over 3 years

    I have constructed a decision tree using rpart for a dataset.

    I have then divided the data into 2 parts - a training dataset and a test dataset. A tree has been constructed for the dataset using the training data. I want to calculate the accuracy of the predictions based on the model that was created.

    My code is shown below:

    library(rpart)
    #reading the data
    data = read.table("source")
    names(data) <- c("a", "b", "c", "d", "class")
    
    #generating test and train data - Data selected randomly with a 80/20 split
    trainIndex  <- sample(1:nrow(data), 0.8 * nrow(data))
    train <- data[trainIndex,]
    test <- data[-trainIndex,]
    
    #tree construction based on information gain
    tree = rpart(class ~ a + b + c + d, data = train, method = 'class', parms = list(split = "information"))
    

    I now want to calculate the accuracy of the predictions generated by the model by comparing the results with the actual values in the train and test data; however, I am facing an error while doing so.

    My code is shown below:

    t_pred = predict(tree,test,type="class")
    t = test['class']
    accuracy = sum(t_pred == t)/length(t)
    print(accuracy)
    

    I get an error message that states -

    Error in t_pred == t : comparison of these types is not implemented
    In addition: Warning message:
    Incompatible methods ("Ops.factor", "Ops.data.frame") for "=="

    On checking the type of t_pred, I found that it is of type integer; however, the documentation

    (https://stat.ethz.ch/R-manual/R-devel/library/rpart/html/predict.rpart.html)

    states that the predict() method must return a vector.

    I am unable to understand why the type of the variable is an integer and not a list. Where have I made the mistake, and how can I fix it?

  • Arat254
    Arat254 over 7 years
    Thank you, this worked. However, I still do not understand the predict method. What exactly does it return, and why is it an integer? When I print t_pred, it looks like a matrix.
  • mtoto
    mtoto over 7 years
    Hard to say what's going on without a reproducible example.
  • Arat254
    Arat254 over 7 years
    I used the Iris dataset for the above example, but that's OK. I have figured it out now. Thanks again for your reply.
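
On the follow-up about what predict returns: a rough sketch, again assuming iris in place of the original data. With type = "class" the result is a factor, and a factor is stored internally as integer codes pointing into its levels, which is why the type shows up as integer. As far as I can tell the result also carries the row names of the input as names, which would explain why the printed output looks like a matrix.

```r
library(rpart)

tree <- rpart(Species ~ ., data = iris, method = "class")
t_pred <- predict(tree, iris, type = "class")

class(t_pred)    # "factor": how R treats the object
typeof(t_pred)   # "integer": internal storage of the level codes
levels(t_pred)   # the class labels those codes refer to
head(names(t_pred))  # names attached to the predictions, if any were carried over

# single brackets keep the data frame; $ extracts the column itself
class(iris["Species"])  # "data.frame"
class(iris$Species)     # "factor"
```

So comparing t_pred with test$class works because both sides are factors, while test['class'] brings a data frame into the comparison.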