Logistic Regression on factor: Error in eval(family$initialize) : y values must be 0 <= y <= 1

67,643

Solution 1

The error asks for y values between 0 and 1 because the categorical columns in your data, such as Direction, are of type character. With family=binomial, glm() expects the response to be a factor (or numeric values between 0 and 1), so convert the column with as.factor(data$Direction). Then refer to the column by name in the formula: glm(Direction ~ lag2, data=...). You don't need to declare stock.direction.
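The error message itself hints at an alternative to a factor: a numeric response coded as 0 and 1. The sketch below uses a made-up data frame (`toy`, its values, and the `up` column are assumptions for illustration, not the asker's IBM data).

```r
# Hypothetical toy data; the values are made up for illustration only
set.seed(42)
toy <- data.frame(
  lag2      = rnorm(30),
  Direction = rep(c("Up", "Down"), 15),
  stringsAsFactors = FALSE          # Direction stays character
)

# Encode the response explicitly: 1 for "Up", 0 for "Down"
toy$up <- as.integer(toy$Direction == "Up")
range(toy$up)    # 0 1

# A 0/1 numeric response satisfies the 0 <= y <= 1 requirement
fit <- glm(up ~ lag2, data = toy, family = binomial)
summary(fit)$coefficients
```

Coding the response by hand makes it explicit which outcome counts as the "success" that the model predicts.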

You can check the class of a variable with class(variable). If it is character, convert it to a factor and store the result as a column in the same data frame. The model should then fit.
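The check-and-convert steps above can be sketched as follows. The data frame here is a made-up stand-in for the asker's data (only the column names `lag2` and `Direction` come from the question; the values are assumptions).

```r
# Hypothetical toy data standing in for the asker's data frame
set.seed(1)
data <- data.frame(
  lag2      = rnorm(40),
  Direction = sample(c("Up", "Down"), 40, replace = TRUE),
  stringsAsFactors = FALSE          # keep Direction as character, as in the failing case
)

class(data$Direction)               # "character"

# Convert the response to a factor inside the same data frame
data$Direction <- as.factor(data$Direction)
class(data$Direction)               # "factor"

# With a factor response, the first level is treated as "failure"
# and every other level as "success"
levels(data$Direction)

# Refer to columns by name and pass the data frame through `data`
training_model <- glm(Direction ~ lag2, data = data, family = binomial)
coef(training_model)
```

Because as.factor() orders levels alphabetically, it is worth checking levels() before interpreting the sign of the coefficients.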

Solution 2

I was getting the same error, "Error in eval(family$initialize) : y values must be 0 <= y <= 1", and solved it by adding "stringsAsFactors=T" to the read.csv call.

BEFORE : gene.train = read.csv("gene.train.csv", header=T) # error

AFTER : gene.train = read.csv("gene.train.csv", header=T, stringsAsFactors=T) # no error.
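A minimal sketch of the difference, using inline CSV text in place of the real file. The contents of gene.train.csv are not shown in the post, so the `expr` and `status` columns below are hypothetical. Note that since R 4.0.0 the default for stringsAsFactors in read.csv is FALSE, which is why string columns now arrive as character unless you ask otherwise.

```r
# Hypothetical CSV content; the real gene.train.csv is not shown in the post
csv_text <- "expr,status
0.21,case
1.40,control
0.87,case
1.95,control"

# Strings kept as character (the default since R 4.0.0)
raw <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)
class(raw$status)          # "character"

# Strings converted to factors while reading
gene.train <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = TRUE)
class(gene.train$status)   # "factor"

# Alternative: convert only the response column after reading
raw$status <- as.factor(raw$status)
```

Converting only the response column avoids turning every string column (IDs, dates, free text) into a factor.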

Author by

Admin

Updated on July 09, 2022

Comments

  • Admin
    Admin almost 2 years

    I am not able to fix the error below, which comes from the following logistic regression:

    training=(IBM$Serial<625)
    data=IBM[!training,]
    dim(data)
    stock.direction <- data$Direction
    training_model=glm(stock.direction~data$lag2,data=data,family=binomial)
    ###Error### ----  Error in eval(family$initialize) : y values must be 0 <= y <= 1
    

    A few rows from the data I am using:

    X  Date        Open        High        Low         Close       Adj.Close   Volume   Return       lag1         lag2          lag3          Direction  Serial
    1  28-11-2012  190.979996  192.039993  189.270004  191.979996  165.107727  3603600  0.004010855  0.004010855  -0.001198021  -0.006354834  Up         1
    2  29-11-2012  192.75      192.899994  190.199997  191.529999  164.720734  4077900  0.00114865   0.00114865   -0.004020279  -0.009502386  Up         2
    3  30-11-2012  191.75      192         189.5       190.070007  163.465073  4936400  0.003630178  0.003630178  -0.001894039  -0.005576956  Up         3
    4  03-12-2012  190.759995  191.300003  188.360001  189.479996  162.957703  3349600  0.001213907  0.001213907  -0.002480478  -0.001636046  Up         4
    
  • smci
    smci about 6 years
    Just reference as.factor(data$Direction). So: glm(Direction ~ lag2, data=...). You don't need to declare stock.direction.