randomForest: Error in na.fail.default: missing values in object


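The missing values are most likely in your predictors, not in return_customer. The error is raised by na.fail, which checks every column used by the formula return_customer ~ ., so a single NA in any predictor is enough to trigger it.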
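A minimal base-R sketch of this behaviour, in case it helps to see it in isolation. The data frame `toy` and its columns `age` and `basket_value` are hypothetical stand-ins, not the asker's data, and `model.frame()` with `na.action = na.fail` is used here on the assumption that it mirrors what the formula interface does before the model is fitted.

```r
# Hypothetical toy data: the response is complete, one predictor has an NA.
toy <- data.frame(
  return_customer = c(0L, 1L, 0L, 1L),
  age             = c(25, NA, 40, 31),
  basket_value    = c(10.5, 20.0, 7.25, 13.0)
)

# The response itself has no missing values ...
response_na <- sum(is.na(toy$return_customer))

# ... yet building the model frame fails, because na.fail checks every
# column referenced by the formula, predictors included.
err <- tryCatch(
  model.frame(return_customer ~ ., data = toy, na.action = na.fail),
  error = function(e) conditionMessage(e)
)

print(response_na)
print(err)
```

So a clean response column does not rule out this error; the predictors have to be checked as well.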

Try this code to remove the rows that contain missing values:

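# Flag every row of train that contains at least one NA
row.has.na <- apply(train, 1, function(x){any(is.na(x))})
# Keep only the complete rows (response and predictors)
predictors_no_NA <- train[!row.has.na, ]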
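Before dropping rows, it can be useful to see which columns actually contain the missing values. The sketch below is one way to do that in base R; `train_toy` and its columns are hypothetical stand-ins for `train`, and `complete.cases()` is shown as an alternative to the `apply()` approach for filtering rows.

```r
# Hypothetical toy data standing in for `train`.
train_toy <- data.frame(
  return_customer = c(0L, 1L, 0L, 1L, 0L),
  age             = c(25, NA, 40, 31, 52),
  city            = c("A", "B", NA, "A", "B")
)

# Count missing values per column to find the offending predictors.
na_per_column <- colSums(is.na(train_toy))

# Keep only rows without any missing value.
train_complete <- train_toy[complete.cases(train_toy), ]

print(na_per_column)
print(nrow(train_complete))
```

Dropping rows discards data, so if many rows are affected, imputing the missing values may be preferable; the randomForest package provides na.roughfix for a simple median/mode fill. The formula method of caret's train() also takes an na.action argument, which as far as I know defaults to na.fail.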

Hopefully it helps.

Author by

Admin

Updated on June 09, 2022

Comments

  • Admin, almost 2 years ago

    I tried to train a random forest with cross-validation, using the caret package:

    ### variable return_customer = binary variable
    idx.train <- createDataPartition(y = known$return_customer, p = 0.8, list = FALSE)
    train <- known[idx.train, ]
    test <- known[-idx.train, ]
    k <- 10
    set.seed(123)
    model.control <- trainControl(method = "cv", number = k, classProbs = TRUE, summaryFunction = twoClassSummary,  allowParallel = TRUE)
    rf.parms <- expand.grid(mtry = 1:10)
    rf.caret <- train(return_customer~., data = train, method = "rf", ntree = 500, tuneGrid = rf.parms, metric = "ROC", trControl = model.control)
    

    When running the train function, I get this error, although there are no missing values in return_customer:

    Error in na.fail.default(list(return_customer = c(0L, 0L, 0L, 0L, 0L, : missing values in object

    I want to understand why the function finds missing values in the data and how I can fix this. I am aware there are similar questions in the forum, but I could not fix my code. Thanks!