How to read a nested collection in Spark

借酒劲吻你 2021-01-31 10:53

I have a parquet table with one of the columns being

    , array<…>

I can run queries against this table in …

4 Answers
  •  佛祖请我去吃肉
    2021-01-31 11:25

    Another approach would be to use pattern matching, like this:

    import org.apache.spark.rdd.RDD
    import org.apache.spark.sql.Row

    val rdd: RDD[(String, List[(String, String)])] = dataFrame.map(_.toSeq.toList match {
      // Destructure each top-level row into its key and its nested rows,
      // then destructure each nested row into a (String, String) pair.
      case List(key: String, inners: Seq[Row]) => key -> inners.map(_.toSeq.toList match {
        case List(a: String, b: String) => (a, b)
      }).toList
    })
    

    You can pattern match directly on Row but it is likely to fail for a few reasons.
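    The core idea can be sketched in plain Scala without a Spark cluster. Below, `Row` is a hypothetical stand-in for `org.apache.spark.sql.Row` (both expose their fields as a `Seq[Any]` via `toSeq`), and the sample data is made up for illustration:

    ```scala
    // Minimal stand-in for org.apache.spark.sql.Row: fields as Seq[Any].
    case class Row(values: Any*) { def toSeq: Seq[Any] = values }

    object NestedMatch extends App {
      // Fake dataset: each row is (key, array of two-field structs).
      val rows = Seq(
        Row("k1", Seq(Row("a", "1"), Row("b", "2"))),
        Row("k2", Seq(Row("c", "3")))
      )

      // Same pattern-matching shape as the Spark answer above:
      // destructure the outer row, then each inner "struct".
      val pairs: Seq[(String, List[(String, String)])] = rows.map(_.toSeq.toList match {
        case List(key: String, inners: Seq[_]) =>
          key -> inners
            .collect { case r: Row => r.toSeq.toList }
            .collect { case List(a: String, b: String) => (a, b) }
            .toList
      })

      assert(pairs == Seq(
        "k1" -> List("a" -> "1", "b" -> "2"),
        "k2" -> List("c" -> "3")
      ))
      println(pairs)
    }
    ```

    As in the real Spark version, the match is partial: any row whose shape does not fit the pattern throws a `MatchError`, so on real data you would add a catch-all case or validate the schema first.
    
    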
