Non-integer ids in Spark MLlib ALS

一个人的身影 2020-12-21 23:03

I'd like to use

val ratings = data.map(_.split(',') match {
      case Array(user,item,rate)
      =>
        Rating(user.toInt,item.toInt,rate.toFloa         


        
1 answer
  • 2020-12-21 23:53

    You can use one of the ML implementations, which support Long ids. The RDD version is significantly less user-friendly than the other implementations:

    import org.apache.spark.ml.recommendation.ALS
    import org.apache.spark.ml.recommendation.ALS.Rating
    
    val ratings = sc.parallelize(Seq(Rating(1L, 2L, 3.0f), Rating(2L, 3L, 5.0f)))
    
    val (userFactors, itemFactors) = ALS.train(ratings)
    

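    It returns only the factors, whereas the DataFrame version returns a model (note that, according to the Spark documentation, the DataFrame-based ALS requires id values within the integer range, even when the column type is Long):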

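    // toDF needs the session implicits in scope (spark-shell imports them automatically)
    val ratingsDF = ratings.toDF()
    
    // Rating's fields (user, item, rating) match the default column names of ALS
    val alsModel = new ALS().fit(ratingsDF)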
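
Since the RDD version hands back only the two factor RDDs, a prediction has to be computed by hand as the dot product of a user's and an item's factor vectors. The following is a minimal sketch of that, not a definitive implementation: the object name `LongIdAlsSketch`, the helper `dot`, the local SparkSession, and the `rank`, `maxIter`, and `seed` values are all assumptions made for illustration.

```scala
import org.apache.spark.ml.recommendation.ALS
import org.apache.spark.ml.recommendation.ALS.Rating
import org.apache.spark.sql.SparkSession

// Hypothetical wrapper object, used only for this illustration.
object LongIdAlsSketch {

  // ALS models a rating as the dot product of the user and item factor vectors.
  def dot(u: Array[Float], v: Array[Float]): Float =
    u.zip(v).map { case (a, b) => a * b }.sum

  def main(args: Array[String]): Unit = {
    // Assumption: a local session; on a cluster or in spark-shell use the existing one.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("als-long-ids")
      .getOrCreate()
    val sc = spark.sparkContext

    // Same toy ratings as in the answer, keyed by Long ids.
    val ratings = sc.parallelize(Seq(Rating(1L, 2L, 3.0f), Rating(2L, 3L, 5.0f)))

    // rank, maxIter and seed are arbitrary values chosen for the sketch.
    val (userFactors, itemFactors) =
      ALS.train(ratings, rank = 2, maxIter = 5, seed = 42L)

    // Look up the factor vectors for user 1 and item 2, then score the pair.
    val userVec = userFactors.lookup(1L).head
    val itemVec = itemFactors.lookup(2L).head
    println(s"predicted rating for (1, 2): ${dot(userVec, itemVec)}")

    spark.stop()
  }
}
```

For more than a handful of pairs, joining the pairs to score against both factor RDDs would scale better than repeated `lookup` calls.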
    