Question
I understand that this duplicates the earlier question "saving pipeline model in pyspark 1.6", but that question still has no definitive answer. Can anyone suggest anything?
Neither joblib nor cPickle works: both raise the same error reported in the earlier question. Is there a way to save a pipeline in PySpark 1.6, or is there none? The questions I found about model persistence were mainly about persisting ML models; saving a pipeline is an altogether different issue. Is there any hack available? I need a Python (v2.7) implementation. In case it helps, I am using the RandomForestClassifier from pyspark.ml as the classification algorithm, and my environment is Spark 1.6 with Python 2.7. Any help is appreciated.
Source: https://stackoverflow.com/questions/43111624/is-there-a-way-to-persist-or-save-the-pipeline-model-in-pyspark-1-6