How to store custom objects in Dataset?

别那么骄傲 2020-11-22 01:53

According to Introducing Spark Datasets:

As we look forward to Spark 2.0, we plan some exciting improvements to Datasets, specifically: ... Custom

9 answers
  •  挽巷 (original poster)
    2020-11-22 01:58

    For those who may be in my situation, I am posting my answer here, too.

    To be specific:

    1. I was reading set-typed data through SQLContext, so the original data format is a DataFrame.

      val sample = spark.sqlContext.sql("select 1 as a, collect_set(1) as b limit 1")
      sample.show()

      +---+---+
      |  a|  b|
      +---+---+
      |  1|[1]|
      +---+---+

    2. Then convert it into an RDD with rdd.map(), reading the array column as a mutable.WrappedArray.

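      import scala.collection.mutable

      // Read column 0 as Int and column 1 (array<int>) as a WrappedArray, then turn it into a Set.
      sample
        .rdd.map(r => (r.getInt(0), r.getAs[mutable.WrappedArray[Int]](1).toSet))
        .collect()
        .foreach(println)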

      Result:

      (1,Set(1))
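
The two steps above can be put together as a small self-contained program. This is only a sketch, assuming Spark 2.x or later on the classpath and a local master. The object name CollectSetToScalaSet and the helper rowsToPairs are made up for illustration. It also swaps in Row.getSeq for the mutable.WrappedArray cast, which avoids naming the concrete collection class Spark returns.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical wrapper object; the name is an assumption for this sketch.
object CollectSetToScalaSet {

  // Step 1: build the DataFrame with an array column produced by collect_set.
  // Step 2: go through the RDD and turn each Row into (Int, Set[Int]).
  def rowsToPairs(spark: SparkSession): Array[(Int, Set[Int])] = {
    val sample = spark.sql("select 1 as a, collect_set(1) as b limit 1")
    sample.rdd
      // getSeq reads the array column as a Seq; toSet makes it an immutable Set.
      .map(r => (r.getInt(0), r.getSeq[Int](1).toSet))
      .collect()
  }

  def main(args: Array[String]): Unit = {
    // Local session for experimenting; the master and app name are assumptions.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("collect-set-sketch")
      .getOrCreate()
    try {
      rowsToPairs(spark).foreach(println) // prints (1,Set(1))
    } finally {
      spark.stop()
    }
  }
}
```

Keeping the conversion in a helper that takes the SparkSession makes it easy to call from spark-shell as well, where a session named spark already exists.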
