How to use GridSearchCV output for a scikit prediction?
gs.predict(X_test) is equivalent to gs.best_estimator_.predict(X_test). With either call, X_test is passed through the entire pipeline and the predictions are returned.
gs.best_estimator_.named_steps['clf'].predict(), however, is only the last step of the pipeline. To use it, the feature selection step must already have been applied, so it only works on data that you have previously run through gs.best_estimator_.named_steps['fs'].transform().
Three equivalent methods for generating predictions are shown below:
Using gs directly:
    pred = gs.predict(X_test)
Using best_estimator_:
    pred = gs.best_estimator_.predict(X_test)
Calling each step in the pipeline individually:
    X_test_fs = gs.best_estimator_.named_steps['fs'].transform(X_test)
    pred = gs.best_estimator_.named_steps['clf'].predict(X_test_fs)
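To check the equivalence of the three methods, here is a minimal sketch. It is not the asker's exact setup: the train/test split, the small n_estimators values, the fixed random_state, the reduced parameter grid ("mean" and "median" thresholds) and cv=3 are all assumptions chosen to keep the example fast and reproducible. It also relies on GridSearchCV's default refit=True, which is what makes gs.predict and gs.best_estimator_ available.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline

# Assumed split; the original question fits on the full dataset.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=0, stratify=y
)

# Same step names ('fs', 'clf') as in the question, with smaller forests.
model = Pipeline([
    ("fs", SelectFromModel(RandomForestClassifier(n_estimators=20, random_state=0))),
    ("clf", RandomForestClassifier(n_estimators=20, random_state=0)),
])

# Assumed, reduced grid so the search finishes quickly.
params = {"fs__threshold": ["mean", "median"]}
gs = GridSearchCV(model, params, cv=3)  # refit=True by default
gs.fit(X_train, y_train)

# Method 1: the fitted search object delegates to the best pipeline.
pred_gs = gs.predict(X_test)

# Method 2: the refitted best pipeline itself.
pred_best = gs.best_estimator_.predict(X_test)

# Method 3: run each pipeline step by hand.
X_test_fs = gs.best_estimator_.named_steps["fs"].transform(X_test)
pred_steps = gs.best_estimator_.named_steps["clf"].predict(X_test_fs)

# All three prediction arrays should be identical.
same = np.array_equal(pred_gs, pred_best) and np.array_equal(pred_gs, pred_steps)
print(same)
```

The third method is mainly useful when you want to inspect the intermediate output, for example to see which features survived selection. For plain prediction the first two are shorter and harder to get wrong.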
Author: user308827
Updated on June 04, 2022
Comments
-
user308827 12 months
In the following code:
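    from sklearn import datasets
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.feature_selection import SelectFromModel
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import Pipeline

    # Load dataset
    iris = datasets.load_iris()
    X, y = iris.data, iris.target

    # Feature selection driven by random-forest importances
    rf_feature_imp = RandomForestClassifier(100)
    feat_selection = SelectFromModel(rf_feature_imp, threshold=0.5)

    # Final classifier
    clf = RandomForestClassifier(5000)

    model = Pipeline([
        ('fs', feat_selection),
        ('clf', clf),
    ])

    params = {
        'fs__threshold': [0.5, 0.3, 0.7],
        'fs__estimator__max_features': ['auto', 'sqrt', 'log2'],
        'clf__max_features': ['auto', 'sqrt', 'log2'],
    }

    gs = GridSearchCV(model, params, ...)
    gs.fit(X, y)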
What should be used for a prediction: gs, gs.best_estimator_, or gs.best_estimator_.named_steps['clf']?
What is the difference between these three?
-
rajesh almost 2 years
Thank you very much! Is there an official doc saying the same?