Difference between predict(model) and predict(model$finalModel) using caret for classification in R


Frank,

This is really similar to your other question on Cross Validated.

You really need to

1) show your exact prediction code for each result

2) give us a reproducible example.

With the normal testSet, RF.CS and RF.CS$finalModel should not be giving you the same results and we should be able to reproduce that. Plus, there are syntax errors in your code so it can't be exactly what you executed.

Finally, I'm not really sure why you would use the finalModel object at all. The point of train is to handle those details for you, and predicting from finalModel directly (which is your option) circumvents the complete set of code that would normally be applied, in particular any pre-processing specified via preProc.
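
For a rough sketch of the difference (using a hypothetical fit object from train with preProc set, not code from the question): predict on the train object re-applies the stored pre-processing before calling the underlying model, while calling finalModel directly skips that step.

 ## Sketch only, assuming something like
 ## fit <- train(y ~ ., data = trainingSet, method = "rf",
 ##              trControl = tc, preProc = c("center", "scale"))

 ## predict.train: applies fit$preProcess to newdata, then predicts
 p1 <- predict(fit, newdata = testSet)

 ## Roughly what you would have to do by hand with the underlying
 ## randomForest object (depending on the caret version, you may need
 ## to drop the outcome column from testSet first)
 p2 <- predict(fit$finalModel,
               newdata = predict(fit$preProcess, testSet))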

Here is a reproducible example:

 library(caret)
 library(mlbench)
 data(Sonar)

 set.seed(1)
 inTrain <- createDataPartition(Sonar$Class)
 training <- Sonar[ inTrain[[1]], ]
 testing  <- Sonar[-inTrain[[1]], ]

 ## Manually centered/scaled copies of the data, for comparison
 pp <- preProcess(training[, -ncol(Sonar)])
 training2 <- predict(pp, training[, -ncol(Sonar)])
 training2$Class <- training$Class
 testing2 <- predict(pp, testing[, -ncol(Sonar)])
 testing2$Class <- testing$Class

 tc <- trainControl("repeatedcv", 
                    number=10, 
                    repeats=10, 
                    classProbs=TRUE, 
                    savePred=T)
 set.seed(2)
 ## RF: fit on the normal (raw) trainingData
 RF <- train(Class ~ ., data = training,
             method = "rf",
             trControl = tc)

 set.seed(2)
 ## RF.CS: same trainingData, centered and scaled by train()
 RF.CS <- train(Class ~ ., data = training,
                method = "rf",
                trControl = tc,
                preProc = c("center", "scale"))
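
As a side note (not shown in the code above): when preProc is given, train keeps the fitted centering/scaling inside the object it returns, and that stored step is what predict.train re-applies to new data.

 ## The pre-processing fitted on the training predictors is stored in
 ## the train object itself (see ?train and ?preProcess)
 RF.CS$preProcess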

Here are some results:

 > ## These should not be the same
 > all.equal(predict(RF, testing,  type = "prob")[,1],
 +           predict(RF, testing2, type = "prob")[,1])
 [1] "Mean relative difference: 0.4067554"
 > 
 > ## Nor should these
 > all.equal(predict(RF.CS, testing,  type = "prob")[,1],
 +           predict(RF.CS, testing2, type = "prob")[,1])
 [1] "Mean relative difference: 0.3924037"
 > 
 > all.equal(predict(RF.CS,            testing, type = "prob")[,1],
 +           predict(RF.CS$finalModel, testing, type = "prob")[,1])
 [1] "names for current but not for target"
 [2] "Mean relative difference: 0.7452435" 
 >
 > ## These should be and are close (just based on the 
 > ## random sampling used in the final RF fits)
 > all.equal(predict(RF,    testing, type = "prob")[,1],
 +           predict(RF.CS, testing, type = "prob")[,1])
 [1] "Mean relative difference: 0.04198887"

Max


Comments

  • Frank, almost 2 years ago

    What's the difference between

    predict(rf, newdata=testSet)
    

    and

    predict(rf$finalModel, newdata=testSet) 
    

    I train the model with preProcess=c("center", "scale"):

    tc <- trainControl("repeatedcv", number=10, repeats=10, classProbs=TRUE, savePred=T)
    rf <- train(y~., data=trainingSet, method="rf", trControl=tc, preProc=c("center", "scale"))
    

    and I receive 0 true positives when I run it on a centered and scaled testSet:

    testSetCS <- testSet
    xTrans <- preProcess(testSetCS)
    testSetCS<- predict(xTrans, testSet)
    testSet$Prediction <- predict(rf, newdata=testSet)
    testSetCS$Prediction <- predict(rf, newdata=testSetCS)
    

    but I receive some true positives when I run it on an unscaled testSet. I have to use rf$finalModel to get some true positives on the centered and scaled testSet, and the rf object on the unscaled one... what am I missing?


    Edit:

    Some tests:

    tc <- trainControl("repeatedcv", number=10, repeats=10, classProbs=TRUE, savePred=T)
    RF <-  train(Y~., data= trainingSet, method="rf", trControl=tc) #normal trainingData
    RF.CS <- train(Y~., data= trainingSet, method="rf", trControl=tc, preProc=c("center", "scale")) #scaled and centered trainingData
    

    On the normal testSet:

    RF predicts reasonably                   (Sensitivity = 0.33, Specificity = 0.97)
    RF$finalModel predicts badly             (Sensitivity = 0.74, Specificity = 0.36)
    RF.CS predicts reasonably                (Sensitivity = 0.31, Specificity = 0.97)
    RF.CS$finalModel, same results as RF.CS  (Sensitivity = 0.31, Specificity = 0.97)
    

    On the centered and scaled testSetCS:

    RF predicts very badly                   (Sensitivity = 0.00, Specificity = 1.00)
    RF$finalModel predicts reasonably        (Sensitivity = 0.33, Specificity = 0.98)
    RF.CS predicts like RF                   (Sensitivity = 0.00, Specificity = 1.00)
    RF.CS$finalModel predicts like RF        (Sensitivity = 0.00, Specificity = 1.00)
    

    So it seems as if $finalModel needs the testSet in the same format as the trainingSet, whereas the trained object accepts only uncentered and unscaled data, regardless of the selected preProcess parameter?

    Prediction code (where testSet is the normal data and testSetCS is centered and scaled):

    testSet$Prediction <- predict(RF, newdata=testSet)
    testSet$PredictionFM <- predict(RF$finalModel, newdata=testSet)
    testSet$PredictionCS <- predict(RF.CS, newdata=testSet)
    testSet$PredictionCSFM <- predict(RF.CS$finalModel, newdata=testSet)
    
    testSetCS$Prediction <- predict(RF, newdata=testSetCS)
    testSetCS$PredictionFM <- predict(RF$finalModel, newdata=testSetCS)
    testSetCS$PredictionCS <- predict(RF.CS, newdata=testSetCS)
    testSetCS$PredictionCSFM <- predict(RF.CS$finalModel, newdata=testSetCS)
    
  • Frank, over 10 years ago
    I used the $finalModel object because I thought that it contains the final (best) model and could therefore calculate predictions and probabilities for new datasets.
  • topepo, over 10 years ago
    It does, and that is what predict.train uses. However, predict.train might do some things to the data in between that matter.
  • usεr11852, over 7 years ago
    Maybe you could clarify the differences between testing and testing2 due to preProcess, as well as the fact that the invocation of predict.train does use preProcess internally while predict(xx$finalModel) does not. Otherwise the post reads a bit like "voodoo stuff happens", as the role of preProcess is never clarified. (Obvious +1 though.)
  • jiggunjer, about 4 years ago
    Shouldn't the last example compare RF + testing2 vs RF.CS + testing?