Spark Pipeline error

感情败类 2021-01-13 02:26

I am trying to run a multinomial logistic regression model:

from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('prepare_data').getOrC


        
1 Answer
  • 2021-01-13 03:14

    You need to make sure there are no missing values in your data -- that's why you get the NullPointerException. Also, make sure that all your input features to the VectorAssembler are numeric.

    BTW, when you create the encoder, you might consider setting its inputCol to the StringIndexer's output column via getOutputCol() instead of hard-coding the column name.
