Error in y - ymean : non-numeric argument to binary operator randomForest R


I'd guess that labels is a character variable, but randomForest expects categorical outcome variables to be factors. Change it to a factor and see if the error goes away:

featureDF$labels = factor(featureDF$labels) 

The help for randomForest isn't explicit about the response needing to be a factor, but it's implied:

y  A response vector. If a factor, classification is assumed, otherwise   
   regression is assumed. If omitted, randomForest will run in unsupervised mode.
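
The same rule applies to the y argument of the matrix interface used in the question. Here is a minimal sketch of checking the class of the response and converting it before fitting. The real X_train and Y_train aren't shown, so the data below are simulated stand-ins, and the column names and ntree value are arbitrary choices for illustration:

```r
library(randomForest)

set.seed(42)
# Hypothetical stand-ins for the question's data: a 0/1 feature matrix
# and a character vector of labels (assumed, not the asker's real data).
X_train <- matrix(rbinom(200 * 10, size = 1, prob = 0.5), nrow = 200)
colnames(X_train) <- paste0("f", 1:10)
Y_train <- sample(c("a", "b", "c"), 200, replace = TRUE)

class(Y_train)      # "character": neither factor nor numeric
is.factor(Y_train)  # FALSE

# Convert the response to a factor so that classification is assumed
rfr <- randomForest(x = X_train, y = factor(Y_train), ntree = 50)
rfr$type            # "classification"
```

If class(Y_train) already reports "factor" or a numeric type, the cause is probably something else and this conversion won't help.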

You haven't provided sample data, so here's an example with the built-in iris data:

Species is a factor in the original data frame. Let's convert Species to character:

library(randomForest)
iris$Species = as.character(iris$Species)
rf <- randomForest(Species ~ ., data=iris)
Error in y - ymean : non-numeric argument to binary operator

After converting Species back to factor, randomForest runs without error.

iris$Species = factor(iris$Species)
rf <- randomForest(Species ~ ., data=iris)
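
To confirm which mode randomForest picked, you can inspect the fitted object. A short sketch with iris, assuming the randomForest package is installed (the seed and ntree values are arbitrary):

```r
library(randomForest)

data(iris)  # restore the built-in copy, where Species is a factor
set.seed(1)

rf <- randomForest(Species ~ ., data = iris, ntree = 100)
rf$type        # "classification", because the response is a factor
levels(rf$y)   # the class labels the forest was trained on

# A numeric response triggers regression instead
rf_reg <- randomForest(Sepal.Length ~ ., data = iris, ntree = 100)
rf_reg$type    # "regression"
```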
Author by

user5735224

Updated on July 29, 2022

Comments

  • user5735224 almost 2 years

    I have a matrix of about 37k x 1024 consisting of 1s and 0s, used as categorical variables to indicate the presence or absence of each feature. I ran this matrix through the randomForest package in R as follows:

    rfr <- randomForest(X_train,Y_train)
    

    Here X_train is the matrix containing the categorical variables and Y_train is a vector of labels, one for every row in the matrix. When I run this, I get the following error:

    Error in y - ymean : non-numeric argument to binary operator
    In addition: Warning message:
    In mean.default(y) : argument is not numeric or logical: returning NA
    

    I checked for null values and missing data but didn't find any.

    I even converted the whole thing into a data.frame and tried the following:

    rfr <- randomForest(labels ~ ., data = featureDF)
    

    I still got the same errors.

    I would appreciate any help with this, thanks!