How to split a multi-value column into separate rows using a typed Dataset?

無奈伤痛 2021-01-05 07:17

I need to split a multi-value column, i.e. one of type List[String], into separate rows, with one row per list element.

The initial dataset has the following type: Dataset

3 answers
  •  不知归路
    2021-01-05 07:50

    You can use explode:

    df.withColumn("property", explode($"property"))
    

    Example:

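    // Outside spark-shell, import these for explode, toDF and $"..."
    import org.apache.spark.sql.functions.explode
    import spark.implicits._

    val df = Seq((1, List("a", "b"))).toDF("A", "B")
    // df: org.apache.spark.sql.DataFrame = [A: int, B: array<string>]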
    
    df.withColumn("B", explode($"B")).show
    +---+---+
    |  A|  B|
    +---+---+
    |  1|  a|
    |  1|  b|
    +---+---+
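
Since the question asks about a typed Dataset, a flatMap-based variant may be closer to what is wanted, because withColumn returns an untyped DataFrame. The following is only a sketch: the case classes Record and FlatRecord, the field names id and property, and the local SparkSession are assumptions for illustration, as the question does not show the actual element type.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Hypothetical row types; the real schema is not shown in the question.
case class Record(id: Int, property: List[String])
case class FlatRecord(id: Int, property: String)

object ExplodeTyped {
  // Pure helper: produces one output row per element of the list column.
  def explodeRecord(r: Record): List[FlatRecord] =
    r.property.map(p => FlatRecord(r.id, p))

  def main(args: Array[String]): Unit = {
    // Local session for the sketch; in spark-shell, `spark` already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("explode-typed")
      .getOrCreate()
    import spark.implicits._

    val ds: Dataset[Record] = Seq(Record(1, List("a", "b"))).toDS()

    // flatMap keeps the result typed as Dataset[FlatRecord].
    val flat: Dataset[FlatRecord] = ds.flatMap(r => explodeRecord(r))
    flat.show()

    spark.stop()
  }
}
```

As with explode, a row whose list is empty produces no output rows.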
    
