Spark 2.1.1: How to predict topics in unseen documents with an already trained LDA model?
Question: I am training an LDA model in PySpark (Spark 2.1.1) on a customer review dataset. Based on that model, I now want to predict the topics of new, unseen text. I am using the following code to build the model:

from pyspark import SparkConf, SparkContext
from pyspark.sql import SparkSession
from pyspark.sql import SQLContext, Row
from pyspark.ml.feature import CountVectorizer
from pyspark.ml.feature import HashingTF, IDF, Tokenizer, CountVectorizer, StopWordsRemover
from pyspark.mllib.clustering