Error - Error in lognet(x, is.sparse, ix, jx, y, weights, offset, alpha, nobs)= etc
Solution 1
The problem is in the model specification. If you use caret's train() formula interface, the training works:
train <- data.frame(churn_x, churn_y)
model_glmnet <- train(churn_y ~ ., data = train,
metric = "ROC",
method = "glmnet",
trControl = myControl
)
> model_glmnet$results
alpha lambda ROC Sens Spec ROCSD SensSD SpecSD
1 0.10 0.0001754386 0.6958156 0.2845934 0.9123349 0.01855530 0.01616471 0.004002873
2 0.10 0.0017543858 0.7187303 0.2901986 0.9185721 0.01681286 0.01415863 0.005347573
3 0.10 0.0175438576 0.7399174 0.2355121 0.9487161 0.01482812 0.03932741 0.010769455
4 0.55 0.0001754386 0.6988285 0.2901800 0.9121614 0.01907845 0.01312159 0.004200233
5 0.55 0.0017543858 0.7260286 0.2946617 0.9185714 0.01761485 0.02171189 0.006755247
6 0.55 0.0175438576 0.7630039 0.2008939 0.9617103 0.01743847 0.03989938 0.006118592
7 1.00 0.0001754386 0.7009482 0.2924146 0.9119881 0.01958200 0.01233419 0.004157393
8 1.00 0.0017543858 0.7313495 0.2957728 0.9203040 0.01797853 0.02356945 0.008478577
9 1.00 0.0175438576 0.7672690 0.1595779 0.9760892 0.01935176 0.01935583 0.007938801
However, when you specify x and y directly, it will not work, because glmnet takes x in the form of a numeric model matrix. When you supply a formula to caret, it takes care of creating the model matrix, but if you just specify x and y, caret assumes x already is a model matrix and passes it to glmnet. For instance, this works:
x <- model.matrix(churn_y ~ ., data = train)
model_glmnet2 <- train(x = x, y = churn_y,
metric = "ROC",
method = "glmnet",
trControl = myControl
)
> model_glmnet2$results
alpha lambda ROC Sens Spec ROCSD SensSD SpecSD
1 0.10 0.0001754386 0.6958156 0.2845934 0.9123349 0.01855530 0.01616471 0.004002873
2 0.10 0.0017543858 0.7187303 0.2901986 0.9185721 0.01681286 0.01415863 0.005347573
3 0.10 0.0175438576 0.7399174 0.2355121 0.9487161 0.01482812 0.03932741 0.010769455
4 0.55 0.0001754386 0.6988285 0.2901800 0.9121614 0.01907845 0.01312159 0.004200233
5 0.55 0.0017543858 0.7260286 0.2946617 0.9185714 0.01761485 0.02171189 0.006755247
6 0.55 0.0175438576 0.7630039 0.2008939 0.9617103 0.01743847 0.03989938 0.006118592
7 1.00 0.0001754386 0.7009482 0.2924146 0.9119881 0.01958200 0.01233419 0.004157393
8 1.00 0.0017543858 0.7313495 0.2957728 0.9203040 0.01797853 0.02356945 0.008478577
9 1.00 0.0175438576 0.7672690 0.1595779 0.9760892 0.01935176 0.01935583 0.007938801
model.matrix is needed only when there are factor features.
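To make the effect of model.matrix concrete, here is a minimal sketch on a made-up data frame (the `toy` data and its column names are hypothetical stand-ins for churn_x, not part of the original question). It shows the factor column being expanded into a 0/1 dummy column while the numeric column passes through. Dropping the intercept column is an optional choice made here, not something the answer above does.

```r
# Hypothetical toy data: one factor feature, one numeric feature, a two-class response
toy <- data.frame(
  plan    = factor(c("basic", "premium", "basic", "premium")),
  minutes = c(120.5, 300.0, 87.2, 410.9),
  y       = factor(c("no", "yes", "no", "yes"))
)

# model.matrix() expands factors into dummy columns and keeps numeric
# columns unchanged; the response y is not part of the result.
x_toy <- model.matrix(y ~ ., data = toy)

# Optional: drop the constant intercept column, since glmnet fits its own intercept.
x_toy <- x_toy[, colnames(x_toy) != "(Intercept)", drop = FALSE]

print(colnames(x_toy))    # "planpremium" "minutes"
print(is.numeric(x_toy))  # TRUE
```

A matrix built this way is entirely numeric, which is what glmnet needs.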
Solution 2
If you are calling glmnet directly and get the same error, try this.
Short answer: using data.matrix() fixed my issue!
Initially, I was doing:
# Given X and Y are data frames
cv.glmnet(x = as.matrix(X), y = as.matrix(Y), alpha = 1, family = "binomial")
This was fixed by:
cv.glmnet(x = data.matrix(X), y = as.matrix(Y), alpha = 1, family = "binomial")
Longer answer (not long at all):
I had the same problem. I was passing my X matrix using as.matrix(), which coerces all columns of a data frame to one common type; if you happen to have factors in your data frame, as.matrix() turns everything into character. Using data.matrix() fixed it for me. data.matrix() can handle factors and ordered factors, whereas as.matrix() is more basic.
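The difference is easy to see on a small example. The sketch below uses a hypothetical `toy` data frame (not the poster's X) with one factor column and one numeric column, and compares the two conversions.

```r
# Hypothetical toy data: one factor column, one numeric column
toy <- data.frame(
  plan    = factor(c("basic", "premium", "basic")),
  minutes = c(120.5, 300.0, 87.2)
)

m_as   <- as.matrix(toy)    # a non-numeric column forces a character matrix
m_data <- data.matrix(toy)  # factors are replaced by their integer codes

print(typeof(m_as))      # "character"
print(typeof(m_data))    # "double"
print(m_data[, "plan"])  # 1 2 1

# Forcing the character matrix to numeric is what yields NAs
# (and the "NAs introduced by coercion" warning).
coerced <- suppressWarnings(as.numeric(m_as[, "plan"]))
print(all(is.na(coerced)))  # TRUE
```

Note that data.matrix() encodes each factor as a single column of integer codes, which imposes an ordering on the levels. If that is not appropriate for a factor with more than two levels, dummy coding via model.matrix() may be the better fit.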
MP61
Updated on July 02, 2022
Getting an error when using glmnet in Caret
Example below.
Load libraries
library(dplyr)
library(caret)
library(C50)
Load churn data set from library C50
data(churn)
Create x and y variables
churn_x <- subset(churnTest, select = -churn)
churn_y <- churnTest[[20]]
Use createFolds() to create 5 CV folds on churn_y, the target variable
myFolds <- createFolds(churn_y, k = 5)
Create trainControl object: myControl
myControl <- trainControl(
  summaryFunction = twoClassSummary,
  classProbs = TRUE, # IMPORTANT!
  verboseIter = TRUE,
  savePredictions = TRUE,
  index = myFolds
)
Fit glmnet model: model_glmnet
model_glmnet <- train(
  x = churn_x,
  y = churn_y,
  metric = "ROC",
  method = "glmnet",
  trControl = myControl
)
I'm getting the following error:
Error in lognet(x, is.sparse, ix, jx, y, weights, offset, alpha, nobs, :
  NA/NaN/Inf in foreign function call (arg 5)
In addition: Warning message:
In lognet(x, is.sparse, ix, jx, y, weights, offset, alpha, nobs, :
  NAs introduced by coercion
I have checked, and there are no missing values in the churn_x variables:
sum(is.na(churn_x))
Does anyone know the answer?