How to save a scikit-learn pipeline with a Keras regressor inside to disk?

梦谈多话 2021-02-01 20:59

I have a scikit-learn pipeline with a KerasRegressor in it:

estimators = [
    ('standardize', StandardScaler()),
    ('mlp', KerasRegressor(build_fn=baseline_m

2 answers
  • 2021-02-01 21:15

    I struggled with the same problem, as there is no direct way to do this. Here is a hack that worked for me: I saved the pipeline in two files. The first file stores the pickled sklearn pipeline and the second stores the Keras model:

    ...
    from keras.models import load_model
    import joblib  # sklearn.externals.joblib was removed in scikit-learn 0.23; use the standalone package
    
    ...
    
    pipeline = Pipeline([
        ('scaler', StandardScaler()),
        ('estimator', KerasRegressor(build_model))
    ])
    
    pipeline.fit(X_train, y_train)
    
    # Save the Keras model first:
    pipeline.named_steps['estimator'].model.save('keras_model.h5')
    
    # This hack allows us to save the sklearn pipeline:
    pipeline.named_steps['estimator'].model = None
    
    # Finally, save the pipeline:
    joblib.dump(pipeline, 'sklearn_pipeline.pkl')
    
    del pipeline
    

    And here is how the model could be loaded back:

    # Load the pipeline first:
    pipeline = joblib.load('sklearn_pipeline.pkl')
    
    # Then, load the Keras model:
    pipeline.named_steps['estimator'].model = load_model('keras_model.h5')
    
    y_pred = pipeline.predict(X_test)
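
The detach-and-reattach trick above can be tried without Keras installed. The following is a minimal sketch using only the standard library: `FakeModel`, `FakeEstimator`, and `dump_without_model` are hypothetical stand-ins invented for illustration (a lock plays the role of the state that pickle rejects), not part of Keras or scikit-learn. It also restores the model after dumping, so the in-memory object stays usable instead of having to be deleted.

```python
import pickle
import threading


class FakeModel:
    """Hypothetical stand-in for a fitted Keras model; the lock makes it unpicklable."""

    def __init__(self, weight):
        self.weight = weight
        self._lock = threading.Lock()  # pickle cannot serialize lock objects

    def predict(self, x):
        return self.weight * x


class FakeEstimator:
    """Hypothetical stand-in for the KerasRegressor step of the pipeline."""

    def __init__(self, model):
        self.model = model


def dump_without_model(estimator):
    """Pickle the wrapper with its model detached, then put the model back."""
    model = estimator.model
    estimator.model = None
    try:
        return pickle.dumps(estimator)
    finally:
        estimator.model = model  # restore even if pickling fails


est = FakeEstimator(FakeModel(weight=2.0))

# Pickling the wrapper directly fails because of the model it holds.
try:
    pickle.dumps(est)
    direct_pickle_ok = True
except TypeError:
    direct_pickle_ok = False

# Detached, the wrapper pickles fine; the model is saved and restored separately.
payload = dump_without_model(est)
restored = pickle.loads(payload)
restored.model = FakeModel(weight=2.0)  # real case: load_model('keras_model.h5')
```

In the real pipeline the same `try`/`finally` shape would wrap the `joblib.dump` call, with `pipeline.named_steps['estimator'].model` as the attribute being detached.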
    
  • 2021-02-01 21:19

    Keras is not compatible with pickle out of the box. You can fix it if you are willing to monkey patch: https://github.com/tensorflow/tensorflow/pull/39609#issuecomment-683370566.
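
To show what a monkey patch of this kind can look like in general, here is a minimal standard-library sketch. It assumes the patch works by attaching a `__reduce__` method to the model class, which tells pickle how to rebuild the object from picklable parts. It is not the code from the linked comment: `Model`, `_rebuild`, and `_reduce` are hypothetical names, and a plain list of weights stands in for whatever a real Keras model would need to serialize.

```python
import pickle
import threading


class Model:
    """Hypothetical stand-in for a class that pickle rejects by default."""

    def __init__(self, weights):
        self.weights = list(weights)
        self._lock = threading.Lock()  # unpicklable state


def _rebuild(weights):
    """Recreate a Model from its picklable parts (called at load time)."""
    return Model(weights)


def _reduce(self):
    """Tell pickle to store only the weights plus the function that rebuilds the object."""
    return (_rebuild, (self.weights,))


# The monkey patch: attach __reduce__ to the existing class after the fact.
Model.__reduce__ = _reduce

original = Model([1.0, 2.0])
clone = pickle.loads(pickle.dumps(original))
```

Once the class is patched, anything that contains such an object, a pipeline included, can go through `pickle` or `joblib` in a single file.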

    You can also use the SciKeras library, which does this for you and is a drop-in replacement for KerasClassifier and KerasRegressor: https://github.com/adriangb/scikeras

    Disclosure: I am the author of SciKeras as well as that PR.
