Unable to serialize logistic regression in MLeap

Submitted by 拈花ヽ惹草 on 2019-12-12 04:49:19

Question


java.lang.AssertionError: assertion failed: This op only supports binary logistic regression

I am trying to serialize a Spark pipeline with MLeap.

The pipeline uses Tokenizer, HashingTF and LogisticRegression.

Serializing it fails with the error above. Here is the code I use to serialize the pipeline:

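    // Imports the snippet relies on: MLeap bundle API, Spark support, scala-arm
    import ml.combust.bundle.BundleFile
    import ml.combust.bundle.serializer.SerializationFormat
    import ml.combust.mleap.spark.SparkSupport._
    import resource._

    val pipeline = Pipeline(pipelineConfig)

    // Fit the Tokenizer -> HashingTF -> LogisticRegression pipeline
    val model = pipeline.fit(data)

    // Write the fitted pipeline as a JSON MLeap bundle; the assertion is thrown here
    (for(bf <- managed(BundleFile("jar:file:/tmp/abc.model.twitter.zip"))) yield {
        model.writeBundle.format(SerializationFormat.Json).save(bf).get
    }).tried.get

    sc.stop()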

According to the documentation, MLeap supports logistic regression, so I cannot tell what I am doing wrong here.


Answer 1:


yashdosi,

MLeap defaults to Spark 2.0 support (sorry, this isn't well documented). Spark 2.0 supported only binary logistic regression; multinomial logistic regression arrived with Spark 2.1. Because MLeap is meant to support 2.0.0 and up, we have built in a mechanism for selecting which version of Spark you are using. MLeap currently supports 2.0 and 2.1, but defaults to 2.0, and under that default the serializer accepts only binary models, which is the assertion you are hitting.
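To confirm that the fitted pipeline really contains a non-binary model, one option is to inspect the logistic regression stage before serializing. The sketch below is only an illustration: the object and method names are hypothetical, and it assumes that "binary" means a model with exactly two classes, which the answer does not spell out.

```scala
import org.apache.spark.ml.PipelineModel
import org.apache.spark.ml.classification.LogisticRegressionModel

// Hypothetical helper, not part of MLeap or Spark.
object LogisticRegressionCheck {
  // Assumption: a model with more than two classes counts as multinomial.
  def isMultinomial(numClasses: Int): Boolean = numClasses > 2

  // Class count of every LogisticRegressionModel stage in a fitted pipeline.
  def classCounts(model: PipelineModel): Seq[Int] =
    model.stages.collect { case lr: LogisticRegressionModel => lr.numClasses }.toSeq

  // True when at least one logistic regression stage has more than two classes.
  def hasMultinomialStage(model: PipelineModel): Boolean =
    classCounts(model).exists(isMultinomial)
}
```

Calling `LogisticRegressionCheck.hasMultinomialStage(model)` on the fitted `model` from the question would then show whether the pipeline needs the Spark 2.1 transformers.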

Try adding this line to the application.conf file in your resources directory. It tells MLeap to use the Spark 2.1 transformers when serializing:

    // application.conf in src/main/resources
    ml.combust.mleap.spark.registry.default = ${ml.combust.mleap.spark.registry.v21}
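If the error persists, it may be worth checking that the override is actually picked up at runtime. The following is a rough sketch, assuming MLeap reads these settings through the Typesafe Config library (which the application.conf convention suggests) and that both keys from the line above exist on the classpath. `RegistryCheck` and `defaultPointsTo` are hypothetical names, not MLeap APIs.

```scala
import com.typesafe.config.{Config, ConfigFactory}

// Hypothetical helper for inspecting the resolved configuration.
object RegistryCheck {
  val Prefix = "ml.combust.mleap.spark.registry"

  // True when `default` resolves to the same value as the named registry (e.g. "v21").
  def defaultPointsTo(config: Config, version: String): Boolean = {
    val defaultPath = s"$Prefix.default"
    val versionPath = s"$Prefix.$version"
    config.hasPath(defaultPath) &&
      config.hasPath(versionPath) &&
      config.getValue(defaultPath) == config.getValue(versionPath)
  }

  def main(args: Array[String]): Unit = {
    // ConfigFactory.load() merges application.conf with every reference.conf on the classpath.
    val config = ConfigFactory.load()
    println(s"default registry is v21: ${defaultPointsTo(config, "v21")}")
  }
}
```

A `false` result would suggest that application.conf is not on the classpath or that the key is misspelled.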


Source: https://stackoverflow.com/questions/44522572/unable-to-serialize-logistic-regressing-in-mleap
