Question
java.lang.AssertionError: assertion failed: This op only supports binary logistic regression
I am trying to serialize a Spark pipeline with MLeap.
My pipeline uses Tokenizer, HashingTF and LogisticRegression.
When I try to serialize the pipeline, I get the error above. Here is the code I am using to serialize it:
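// MLeap and scala-arm imports this snippet relies on
import ml.combust.bundle.BundleFile
import ml.combust.bundle.serializer.SerializationFormat
import ml.combust.mleap.spark.SparkSupport._
import resource._
val pipeline = Pipeline(pipelineConfig)
val model = pipeline.fit(data)
// Write the fitted pipeline to an MLeap bundle in JSON format
(for (bf <- managed(BundleFile("jar:file:/tmp/abc.model.twitter.zip"))) yield {
  model.writeBundle.format(SerializationFormat.Json).save(bf).get
}).tried.get
sc.stop()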
According to the documentation, MLeap supports logistic regression, so I don't understand what I am doing wrong here.
Answer 1:
yashdosi,
MLeap defaults to Spark 2.0 support (sorry, this isn't well documented). Spark 2.0 supported only binary logistic regression; Spark 2.1 introduced multinomial logistic regression. Because MLeap is meant to support 2.0.0 and up, we built in a mechanism for selecting which version of Spark you are using (MLeap currently supports 2.0 and 2.1, but defaults to 2.0).
Try adding this line to the application.conf file in your resources directory. It tells MLeap to use the Spark 2.1 transformers when serializing:
// application.conf in src/main/resources
ml.combust.mleap.spark.registry.default = ${ml.combust.mleap.spark.registry.v21}
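If you want to see how that override resolves, here is a minimal sketch using Typesafe Config directly. It assumes the registry keys hold lists of op class names; the `reference` string is only a hypothetical stand-in for the defaults MLeap ships in its own reference.conf, and the `placeholder.*` values and the `RegistryOverrideCheck` name are made up for illustration.

```scala
import com.typesafe.config.{Config, ConfigFactory}

object RegistryOverrideCheck {
  private val key = "ml.combust.mleap.spark.registry"

  // Hypothetical stand-in for MLeap's bundled defaults (real contents will differ).
  val reference: Config = ConfigFactory.parseString(
    """ml.combust.mleap.spark.registry.v20 = ["placeholder.v20.Op"]
      |ml.combust.mleap.spark.registry.v21 = ["placeholder.v21.Op"]
      |ml.combust.mleap.spark.registry.default = ${ml.combust.mleap.spark.registry.v20}
      |""".stripMargin)

  // The single line from application.conf shown above.
  val application: Config = ConfigFactory.parseString(
    "ml.combust.mleap.spark.registry.default = ${ml.combust.mleap.spark.registry.v21}")

  // application.conf takes precedence; substitutions resolve against the merged tree.
  val resolved: Config = application.withFallback(reference).resolve()

  def main(args: Array[String]): Unit = {
    println(resolved.getValue(s"$key.default").render())
    // Prints whether the default registry now matches the v21 registry.
    println(resolved.getValue(s"$key.default") == resolved.getValue(s"$key.v21"))
  }
}
```

In a real project you would most likely call `ConfigFactory.load()` instead of building the two configs by hand, which merges your application.conf with every reference.conf on the classpath in the same way.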
Source: https://stackoverflow.com/questions/44522572/unable-to-serialize-logistic-regressing-in-mleap