Random Forest tuning with RandomizedSearchCV


Add the scoring parameter to RandomizedSearchCV:

RandomizedSearchCV(scoring="neg_mean_squared_error", ...

Other scoring options are listed in the scikit-learn docs.

With scoring set this way, you can print the RMSE for each parameter set alongside the parameters:

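import numpy as np
cv_results = rf_random.cv_results_
# mean_test_score holds the negative MSE, so negate it before taking the square root
for mean_score, params in zip(cv_results["mean_test_score"], cv_results["params"]):
    print(np.sqrt(-mean_score), params)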
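For anyone who wants to try this end to end, the following is a self-contained sketch of the same idea. The data is synthetic (make_regression stands in for X_1 and Y, which are not shown in the question), and the grid and n_iter are deliberately tiny so it finishes quickly; the names X_demo, y_demo, small_grid and search are assumptions made for illustration only.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in for X_1 and Y (hypothetical data, not from the question).
X_demo, y_demo = make_regression(n_samples=120, n_features=5, noise=10.0, random_state=0)

# Deliberately small grid so the sketch runs in a few seconds.
small_grid = {
    "max_depth": [3, 5, None],
    "min_samples_leaf": [1, 2, 4],
    "n_estimators": [10, 20],
}

search = RandomizedSearchCV(
    estimator=RandomForestRegressor(random_state=0),
    param_distributions=small_grid,
    n_iter=4,
    cv=3,
    scoring="neg_mean_squared_error",
    random_state=42,
)
search.fit(X_demo, y_demo)

# mean_test_score holds the negative MSE, so negate before taking the root.
rmse_per_candidate = np.sqrt(-search.cv_results_["mean_test_score"])
for rmse, params in zip(rmse_per_candidate, search.cv_results_["params"]):
    print(f"{rmse:.2f}", params)

# best_index_ points at the highest score, i.e. the lowest MSE and lowest RMSE.
best_rmse = rmse_per_candidate[search.best_index_]
print("best:", search.best_params_, f"RMSE={best_rmse:.2f}")
```

Because the search maximizes the negative MSE, best_params_ already corresponds to the lowest RMSE computed this way. Note that this is the square root of the fold-averaged MSE, which is not identical to averaging the per-fold RMSE values.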
Author by

raffa_sa

Updated on September 16, 2022

Comments

  • raffa_sa, over 1 year ago

    I have a few questions concerning Randomized grid search in a Random Forest Regression Model. My parameter grid looks like this:

    random_grid = {'bootstrap': [True, False],
                   'max_depth': [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, None],
                   'max_features': [1.0, 'sqrt'],  # 'auto' was removed in scikit-learn 1.3; 1.0 (all features) is its regressor equivalent
                   'min_samples_leaf': [1, 2, 4],
                   'min_samples_split': [2, 5, 10],
                   'n_estimators': [130, 180, 230]}
    

    and my code for the RandomizedSearchCV like this:

    # Use the random grid to search for the best hyperparameters
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import RandomizedSearchCV
    # First create the base model to tune
    rf = RandomForestRegressor()
    # Random search of parameters, using 3-fold cross-validation,
    # searching across 100 different combinations and using all available cores
    rf_random = RandomizedSearchCV(estimator=rf, param_distributions=random_grid,
                                   n_iter=100, cv=3, verbose=2, random_state=42,
                                   n_jobs=-1)
    # Fit the random search model
    rf_random.fit(X_1, Y)
    

    Is there any way to calculate the root mean squared error (RMSE) for each parameter set? That would be more interesting to me than the R^2 score. When I retrieve the best parameter set, as shown below, I would also like it to be the one with the lowest RMSE. Is there any way to do that?

    rf_random.best_params_
    rf_random.best_score_
    rf_random.best_estimator_
    

    Thank you, R

  • raffa_sa, over 5 years ago
    So RandomizedSearchCV should now work with the RMSE internally, right? Then I don't understand my result: rf_random.best_score_ gives -13684.3. An RMSE can't normally be negative, can it? @Tobi
  • Tobi, over 5 years ago
    You are almost correct. It works with the MSE (without the square root). However, Grid/Randomized/...SearchCV always maximizes the score, so the metric has to be the negative MSE. That is why I used np.sqrt(-mean_score). An explanation for the negation is given here: stackoverflow.com/questions/21050110/….
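The conversion described in the last comment can be checked by hand for the reported score. This is only a small arithmetic sketch: -13684.3 is the best_score_ value quoted above, and the variable names are illustrative.

```python
import math

# Value reported in the comment above for rf_random.best_score_ (a negative MSE).
best_score = -13684.3

mse = -best_score        # undo the sign flip applied by the scorer
rmse = math.sqrt(mse)    # RMSE is in the same units as the target
print(f"MSE = {mse:.1f}, RMSE = {rmse:.2f}")
```

So the negative best_score_ corresponds to an RMSE of roughly 116.98 in the units of Y.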