How do I test data against a decision tree model in R?


To be able to do this, I assume you have split your data into a training set and a test set.
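
If you have not made that split yet, here is a minimal sketch; mydata stands in for your full data frame (the names are placeholders):

set.seed(42)                                                    # for a reproducible split
idx       <- sample(nrow(mydata), round(0.7 * nrow(mydata)))    # 70% of the rows go to training
traindata <- mydata[idx, ]                                      # training subset
testdata  <- mydata[-idx, ]                                     # held-out test subset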

To create the training model you can use:

library(rpart)                                  # load the package if you have not already
model <- rpart(y ~ ., traindata, minbucket = 5) # I suspect you have done this already

To apply it to the test set:

pred <- predict(model, testdata, type = "class")   # type = "class" returns predicted class labels for a classification tree

You then get a vector of predicted results.

In your test data set you also have the "real" answer; let's say it is in the last column.

Simply comparing them element-wise will show which predictions are correct:

pred == testdata[ , last]  # where 'last' is the column index of 'y'

Where the elements are equal you get a TRUE; a FALSE means the prediction was wrong.

pred == "1" & testdata[, last] == "1"   # TRUE positives: predicted positive and actually positive (assuming a binary 0/1 outcome)
pred == testdata[, last]                # gives those that are correct
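
A cross-tabulation shows all four counts (true/false positives and negatives) at once; table() is base R, and actual is just a convenience name introduced here:

actual <- testdata[, last]                  # the observed classes
table(predicted = pred, actual = actual)    # rows = predicted class, columns = actual class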

It might be interesting to see what percentage you got correct:

mean(pred == testdata[ , last])    # here TRUE will count as a 1, and FALSE as 0
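
If you also want each data point labelled as a true/false positive or negative, as the question describes, something along these lines works for a binary outcome coded 0/1 (treating the "1" level as positive is an assumption you may need to adapt):

actual  <- testdata[, last]                              # the observed classes, as above
outcome <- ifelse(pred == "1" & actual == "1", "true positive",
           ifelse(pred == "1" & actual == "0", "false positive",
           ifelse(pred == "0" & actual == "0", "true negative",
                                               "false negative")))
head(data.frame(pred, actual, outcome))                  # inspect the first few classified points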


Comments

  • bernie2436, about 2 years ago

    I built a decision tree from training data using the rpart package in R. Now I have more data and I want to run it through the tree to validate the model. Logically/iteratively, I want to do the following:

    for each datapoint in new data
         run point thru decision tree, branching as appropriate
         examine how tree classifies the data point
         determine if the datapoint is a true positive or false positive
    

    How do I do that in R?