flatMap on a DataFrame

小蘑菇 2021-01-06 18:31

What is the best way to perform a flatMap on a DataFrame in Spark? From searching around and doing some testing, I have come up with two different

1 Answer
  • 2021-01-06 19:15

    You can create a second DataFrame from your map:

    import spark.implicits._  // enables toDF and $"col"; assumes a SparkSession named spark
    val mapDF = Map("a" -> List("c","d","e"), "b" -> List("f","g","h")).toList.toDF("key", "value")
    

    Then do the join and apply the explode function:

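    import org.apache.spark.sql.functions.explode

    // Keep only rows whose x has an entry in the map, then emit one row per list element.
    val joinedDF = df.join(mapDF, df("x") === mapDF("key"), "inner")
      .select("value", "y")
      .withColumn("value", explode($"value"))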
    

    Showing joinedDF then gives the flattened result, one row per list element:

    joinedDF.show()
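
    For reference, the steps above can be put together as a minimal, self-contained sketch. It assumes Spark SQL is on the classpath and runs against a local SparkSession. The input df, its sample rows, and the names FlatMapViaExplode and build are hypothetical, since the question's actual data is not shown here.

    ```scala
    import org.apache.spark.sql.{DataFrame, SparkSession}
    import org.apache.spark.sql.functions.explode

    object FlatMapViaExplode {
      // Builds the flattened DataFrame from the join-then-explode steps.
      def build(spark: SparkSession): DataFrame = {
        import spark.implicits._ // needed for toDF and the $"col" syntax

        // Hypothetical input: "z" has no entry in the map, so the inner join drops it.
        val df = Seq(("a", 1), ("b", 2), ("z", 3)).toDF("x", "y")

        // Lookup table: one row per key, with the list stored as an array column.
        val mapDF = Map("a" -> List("c", "d", "e"), "b" -> List("f", "g", "h"))
          .toList.toDF("key", "value")

        df.join(mapDF, df("x") === mapDF("key"), "inner")
          .select("value", "y")
          .withColumn("value", explode($"value")) // one output row per list element
      }

      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("flatmap-via-explode")
          .master("local[*]")
          .getOrCreate()
        build(spark).show()
        spark.stop()
      }
    }
    ```

    With this sample input the result has six rows, three for each matched key. The row order is not guaranteed unless you add an explicit orderBy.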
    