Multiclass classification in XGBoost (Python)
You don't need to set num_class in the scikit-learn API for XGBoost classification. It is done automatically when fit is called. Look at xgboost/sklearn.py, at the beginning of the fit method of XGBClassifier:
    evals_result = {}
    self.classes_ = np.unique(y)
    self.n_classes_ = len(self.classes_)

    xgb_options = self.get_xgb_params()

    if callable(self.objective):
        obj = _objective_decorator(self.objective)
        # Use default value. Is it really not used ?
        xgb_options["objective"] = "binary:logistic"
    else:
        obj = None

    if self.n_classes_ > 2:
        # Switch to using a multiclass objective in the underlying XGB instance
        xgb_options["objective"] = "multi:softprob"
        xgb_options['num_class'] = self.n_classes_
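To make the effect of that excerpt concrete, here is a minimal pure-Python sketch of the same decision. The function name resolve_objective is hypothetical and not part of xgboost, sorted(set(...)) stands in for np.unique, and the callable-objective branch is left out. It only mirrors the logic quoted above, which may differ in other xgboost versions.

```python
def resolve_objective(labels, objective="binary:logistic"):
    """Hypothetical helper mirroring the quoted excerpt; not an xgboost API."""
    classes = sorted(set(labels))  # stands in for np.unique(y)
    n_classes = len(classes)

    options = {"objective": objective}
    if n_classes > 2:
        # More than two distinct labels: switch to a multiclass objective
        # and record the class count, as the excerpt does.
        options["objective"] = "multi:softprob"
        options["num_class"] = n_classes
    return options


binary = resolve_objective([0, 1, 1, 0])
multi = resolve_objective([0, 2, 1, 2, 0], objective="multi:softmax")

print(binary)  # {'objective': 'binary:logistic'}
print(multi)   # {'objective': 'multi:softprob', 'num_class': 3}
```

Note that in this sketch, as in the excerpt, a requested 'multi:softmax' objective is replaced by 'multi:softprob' once more than two classes are seen, and num_class is filled in from the labels rather than passed by the caller.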
Author: user3804483
Updated on June 13, 2022
I can't figure out how to pass the number of classes or an eval metric to xgb.XGBClassifier with the objective function 'multi:softmax'.
I looked through a lot of documentation, but it only talks about the sklearn wrapper, which accepts n_class/num_class.
My current setup looks like this:
    kf = cross_validation.KFold(y_data.shape[0],
                                n_folds=10, shuffle=True, random_state=30)
    err = []  # to hold cross val errors

    # xgb instance
    xgb_model = xgb.XGBClassifier(n_estimators=_params['n_estimators'],
                                  max_depth=_params['max_depth'],
                                  learning_rate=_params['learning_rate'],
                                  min_child_weight=_params['min_child_weight'],
                                  subsample=_params['subsample'],
                                  colsample_bytree=_params['colsample_bytree'],
                                  objective='multi:softmax', nthread=4)

    # cv
    for train_index, test_index in kf:
        xgb_model.fit(x_data[train_index], y_data[train_index], eval_metric='mlogloss')
        predictions = xgb_model.predict(x_data[test_index])
        actuals = y_data[test_index]
        err.append(metrics.accuracy_score(actuals, predictions))