What are the pitfalls of using Dill to serialise scikit-learn/statsmodels models?

灰色年华 2021-01-31 09:24

I need to serialise scikit-learn/statsmodels models such that all the dependencies (code + data) are packaged in an artefact and this artefact can be used to initialise the mode

3 Answers
  •  深忆病人
    2021-01-31 09:56

    I package a Gaussian process (GP) model from scikit-learn using pickle.

    The primary reason is that the GP takes a long time to build, while loading it from a pickle is much faster. So during initialisation my code checks whether the model's data files have been updated: if they have, it regenerates the model; otherwise it simply deserialises the model from the pickle.
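The check described above can be sketched roughly as follows. This is a minimal illustration rather than the actual code: the function name `load_or_rebuild`, its parameters, and the use of file modification times to detect updated data are all assumptions.

```python
import pickle
from pathlib import Path


def load_or_rebuild(data_path, cache_path, build_model):
    """Return the cached model, rebuilding it if the data file is newer.

    `build_model` is a hypothetical callable that takes the data path and
    returns a fitted model (for example, a scikit-learn GP regressor).
    """
    data_path, cache_path = Path(data_path), Path(cache_path)

    # The cache counts as fresh if it exists and is at least as new as the data.
    cache_is_fresh = (
        cache_path.exists()
        and cache_path.stat().st_mtime >= data_path.stat().st_mtime
    )
    if cache_is_fresh:
        # Only unpickle files you created yourself: pickle can run arbitrary code.
        with cache_path.open("rb") as fh:
            return pickle.load(fh)

    model = build_model(data_path)  # the slow step, e.g. fitting the GP
    with cache_path.open("wb") as fh:
        pickle.dump(model, fh, protocol=pickle.HIGHEST_PROTOCOL)
    return model
```

Comparing modification times is only one way to detect stale data; hashing the data file would be more robust if the files can be rewritten without their contents changing.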

    I would try pickle, dill, and cloudpickle, in that order of preference.

    Note that pickle's dump functions accept a protocol keyword argument, and some protocol values can significantly speed up serialisation and reduce memory usage. Finally, I wrap the pickle code with compression from the Python standard library (for example gzip, bz2, or lzma) if necessary.
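As a rough sketch of the last two points, the snippet below picks an explicit pickle protocol and wraps the file in gzip compression. The helper names `save_compressed` and `load_compressed` are hypothetical, and gzip is just one of the standard-library options; whether compression pays off depends on the model.

```python
import gzip
import pickle


def save_compressed(obj, path):
    """Pickle `obj` with the newest protocol into a gzip-compressed file."""
    with gzip.open(path, "wb") as fh:
        pickle.dump(obj, fh, protocol=pickle.HIGHEST_PROTOCOL)


def load_compressed(path):
    """Load an object written by `save_compressed` (trusted files only)."""
    with gzip.open(path, "rb") as fh:
        return pickle.load(fh)
```

A fitted scikit-learn estimator can be passed in place of `obj`. Note that a file written with `pickle.HIGHEST_PROTOCOL` may not load on an older Python version that lacks that protocol.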
