Python scikit-learn: Cannot clone object... as the constructor does not seem to set parameter


Without seeing your code, it's hard to tell exactly what goes wrong, but you are violating a scikit-learn API convention here. The constructor in an estimator should only set attributes to the values the user passes as arguments. All computation should occur in fit, and if fit needs to store the result of a computation, it should do so in an attribute with a trailing underscore (_). This convention is what makes clone and meta-estimators such as GridSearchCV work. (*)

(*) If you ever see an estimator in the main codebase that violates this rule: that would be a bug, and patches are welcome.
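
A minimal sketch of the difference, assuming a constructor along the lines of the snippet quoted in the comments. The class names ComputesInInit and StoresAsPassed and the attribute visible_offsets_ are hypothetical and only serve the illustration; they are not the asker's actual code.

```python
import numpy as np
from sklearn.base import BaseEstimator, clone


class ComputesInInit(BaseEstimator):
    """Hypothetical estimator that breaks the convention."""

    def __init__(self, visible_config=None):
        # Problem: the stored value differs from the argument that was passed.
        self.visible_config = np.cumsum(
            np.concatenate((np.asarray([0]), visible_config), axis=0))


class StoresAsPassed(BaseEstimator):
    """Hypothetical estimator that follows the convention."""

    def __init__(self, visible_config=None):
        # Store the argument untouched: no copying, no computation.
        self.visible_config = visible_config

    def fit(self, X, y=None):
        # Computation happens in fit; the result gets a trailing underscore.
        self.visible_offsets_ = np.cumsum(
            np.concatenate((np.asarray([0]), self.visible_config), axis=0))
        return self


group_sizes = np.array([21, 21, 21])

try:
    clone(ComputesInInit(visible_config=group_sizes))
    clone_failed = False
except RuntimeError:
    # clone rebuilds the estimator from get_params() and rejects the mismatch
    # between what it passed to __init__ and what __init__ stored.
    clone_failed = True

good = StoresAsPassed(visible_config=group_sizes)
good_copy = clone(good)  # succeeds: the parameter round-trips unchanged
good.fit(np.zeros((4, 63)))
print(clone_failed, good.visible_offsets_)
```

The first class fails because clone constructs a new estimator from the parameters reported by get_params() and then checks that the constructor stored them unchanged. Moving the cumsum into fit, or passing the pre-computed array into the constructor, lets that check pass.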


Author: user1953384

Updated on June 06, 2022

Comments

  • user1953384
    user1953384 almost 2 years

    I modified the BernoulliRBM class of scikit-learn to use groups of softmax visible units. In the process, I added an extra NumPy array, visible_config, as an attribute that is initialized in the constructor as follows:

    self.visible_config = np.cumsum(np.concatenate((np.asarray([0]),
                                    visible_config), axis=0))
    

    where visible_config is a NumPy array passed as an input to the constructor. The code runs without errors when I call fit() directly to train the model. However, when I use GridSearchCV, I get the following error:

    Cannot clone object SoftmaxRBM(batch_size=100, learning_rate=0.01, n_components=100, n_iter=100,
      random_state=0, verbose=True, visible_config=[ 0 21 42 63]), as the constructor does not seem to set parameter visible_config
    

    This seems to be a problem in the equality check between the instance of the class and its copy created by sklearn.base.clone because visible_config does not get copied correctly. I'm not sure how to fix this. It says in the documentation that sklearn.base.clone uses a deepcopy(), so shouldn't visible_config also get copied? Can someone please explain what I can try here? Thanks!

  • user1953384
    user1953384 almost 10 years
    You're right. Thanks! Removing the computation step and passing the pre-computed visible config into the constructor fixed the problem.
  • O.rka
    O.rka almost 7 years
    For anyone else running into this: make sure you're not cloning your model in __init__. For example, don't do this: class NewAlgo(BaseEstimator, OtherEstimator): def __init__(self, model): self.model = clone(model)
  • Chris
    Chris almost 4 years
    I ran into this issue with xgboost and RandomizedSearchCV when my parameter grid contained a misspelled or deprecated parameter.
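
The pitfall with cloning a wrapped model in __init__ can be avoided by storing the model exactly as passed and cloning it inside fit instead. The following is only a sketch under that assumption: the class name WrapsModel and the fitted attribute model_ are hypothetical, and LogisticRegression is just a stand-in for whatever model is being wrapped.

```python
import numpy as np
from sklearn.base import BaseEstimator, clone
from sklearn.linear_model import LogisticRegression


class WrapsModel(BaseEstimator):
    """Hypothetical wrapper that keeps __init__ free of computation."""

    def __init__(self, model=None):
        # Store the wrapped model as passed; do not call clone() here.
        self.model = model

    def fit(self, X, y):
        # Clone at fit time so the user's original model is never mutated.
        self.model_ = clone(self.model).fit(X, y)
        return self

    def predict(self, X):
        return self.model_.predict(X)


X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])

wrapper = WrapsModel(model=LogisticRegression())
wrapper_copy = clone(wrapper)  # works because __init__ stored `model` untouched
predictions = wrapper.fit(X, y).predict(X)
```

Because the constructor no longer alters its argument, the wrapper can be cloned by GridSearchCV or RandomizedSearchCV like any other estimator.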