type str doesn't define __round__ method error


More than likely, some of the labels in y_train are strings rather than numbers. sklearn and xgboost don't require the labels to be numeric, so the model fits without complaint and predict then returns those same string labels.

Try checking the types of the values in y_pred:

from collections import Counter

Counter([type(value) for value in y_pred])
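Since the predicted labels are drawn from the training labels, the same check is worth running on y_train as well. Below is a minimal sketch of that idea; the helper name type_counts and the sample labels are made up for illustration and are not part of the original code.

```python
from collections import Counter


def type_counts(values):
    """Count how many values of each Python type appear in an iterable."""
    return Counter(type(value).__name__ for value in values)


# Hypothetical labels: mostly integers, with a few strings mixed in
labels = [0, 1, "1", 0, "0"]
print(type_counts(labels))  # Counter({'int': 3, 'str': 2})
```

If anything other than a numeric type shows up in the counts for y_train, the same types should be expected in y_pred.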

Here is an example of what I mean, first with numeric labels:

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# test with numeric labels
x = np.vstack([np.arange(100), np.sort(np.random.normal(10, size=100))]).T
y = np.hstack([np.zeros(50, dtype=int), np.ones(50, dtype=int)])
model = GradientBoostingClassifier()
model.fit(x,y)
model.predict([[10,7]])
# returns an array with a numeric label
array([0])

And here with string labels (same x data):

y = ['a']*50 + ['b']*50
model.fit(x,y)
model.predict([[10,7]])
# returns an array with a string label
array(['a'], dtype='<U1')

Both are valid labels. However, when you call round on a string value, you get exactly the error you are seeing.

round('a')

TypeError: type str doesn't define __round__ method
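One possible way around this, sketched here under the assumption that the labels really are categorical strings, is to map them to integers with sklearn's LabelEncoder before fitting. The data reuses the toy example above (with a seeded generator for reproducibility); the variable names encoder, y_encoded and original are illustrative only.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.preprocessing import LabelEncoder

# Same toy data as above, with string labels
rng = np.random.default_rng(0)
x = np.vstack([np.arange(100), np.sort(rng.normal(10, size=100))]).T
y = np.array(["a"] * 50 + ["b"] * 50)

# Map each distinct string label to an integer (classes are sorted: 'a' -> 0, 'b' -> 1)
encoder = LabelEncoder()
y_encoded = encoder.fit_transform(y)

model = GradientBoostingClassifier()
model.fit(x, y_encoded)

# predict now returns integer class labels, so no rounding is needed
y_pred = model.predict(x)
predictions = [int(value) for value in y_pred]

# Recover the original string labels when they are needed for reporting
original = encoder.inverse_transform(y_pred)
```

If the strings are really numbers that were read in as text (for example '0' and '1' from a CSV file), converting the label column to a numeric dtype before training may be the simpler fix.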
Author: rnv86

Updated on January 25, 2021

Comments

  • rnv86, over 3 years ago

    I am trying to use XGBoost to determine the most important variables, but I get an error related to the arrays.

    My complete code is the following:

    from numpy import loadtxt
    from numpy import sort
    import pandas as pd
    from xgboost import XGBClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score
    from sklearn.feature_selection import SelectFromModel
    
    
    df = pd.read_csv('data.txt')
    array = df.values
    X = array[:, 0:330]
    Y = array[:, 330]
    
    X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, random_state=7)
    
    
    model = XGBClassifier()
    model.fit(X_train, y_train)
    
    
    y_pred = model.predict(X_test)
    predictions = [round(value) for value in y_pred]
    

    and I get the following error:

    TypeError: type str doesn't define __round__ method
    

    What can I do?