type str doesn't define __round__ method error
More than likely, some of the labels in y_train are actually strings instead of numbers. sklearn and xgboost don't require the labels to be numeric, so a model trained on string labels predicts strings. Try checking the types of the values in y_pred:
from collections import Counter
Counter([type(value) for value in y_pred])
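As a rough illustration of what that check reports, here is a sketch using a made-up `y_pred` that mixes strings and integers (the values are hypothetical; in the question they would come from `model.predict(X_test)`):

```python
from collections import Counter

# Hypothetical predictions, standing in for model.predict(X_test).
y_pred = ['0', '1', 1, '0']

# Count how many predictions there are of each Python type.
type_counts = Counter(type(value) for value in y_pred)
print(type_counts)

# Any str entries mean round(value) will fail for those elements.
has_strings = type_counts[str] > 0
print(has_strings)
```

If `str` shows up in the counts, the labels passed to `fit` were most likely strings as well.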
Here is an example of what I mean, first with numeric labels:
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
# test with numeric labels
x = np.vstack([np.arange(100), np.sort(np.random.normal(10, size=100))]).T
y = np.hstack([np.zeros(50, dtype=int), np.ones(50, dtype=int)])
model = GradientBoostingClassifier()
model.fit(x,y)
model.predict([[10,7]])
# returns an array with a numeric label
array([0])
and here with string labels (same x data):
y = ['a']*50 + ['b']*50
model.fit(x,y)
model.predict([[10,7]])
# returns an array with a string label
array(['a'], dtype='<U1')
Both are valid labels. However, when you attempt to use round on a string, you get exactly the error you are seeing:
round('a')
TypeError: type str doesn't define __round__ method
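If the string labels are really numbers stored as text (for example '0' and '1'), one possible workaround is to convert them before rounding. This is only a sketch under that assumption: `to_numeric_labels` is a hypothetical helper, the sample `y_pred` is made up, and the conversion raises `ValueError` for genuinely categorical labels such as 'a', which cannot be rounded at all.

```python
def to_numeric_labels(labels):
    """Convert labels such as '0' or '1.0' to ints.

    Hypothetical helper: assumes every label is a number or a numeric string.
    """
    converted = []
    for value in labels:
        number = float(value)  # raises ValueError for labels like 'a'
        converted.append(int(round(number)))
    return converted


# Made-up predictions, standing in for model.predict(X_test).
y_pred = ['0', '1', '1.0', '0']
predictions = to_numeric_labels(y_pred)
print(predictions)
```

Converting the label column to a numeric type before calling `fit` would address the cause rather than the symptom, since the model would then predict numbers in the first place.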
Author: rnv86
Updated on January 25, 2021

Original question (rnv86):
I am trying to use XGBoost to determine the most important variables, but I get an error related to the arrays.
My complete code is the following:
from numpy import loadtxt
from numpy import sort
import pandas as pd
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.feature_selection import SelectFromModel

df = pd.read_csv('data.txt')
array = df.values
X = array[:,0:330]
Y = array[:,330]
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, random_state=7)
model = XGBClassifier()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
predictions = [round(value) for value in y_pred]
and I get the following error:
TypeError: type str doesn't define __round__ method
What can I do?