How to explode an array into multiple columns in Spark

猫巷女王i 2020-12-01 15:15

I have a Spark DataFrame that looks like this:

id   DataArray
a    array(3,2,1)
b    array(4,2,1)     
c    array(8,6,1)
d    array(8,2,4)

I want to

2 answers
  • 2020-12-01 15:35

    You can use foldLeft to add one column for each element of DataArray.

    Make a list of the column names you want to add, then fold over it with the element index:

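    import org.apache.spark.sql.functions.col

    val columns = List("col1", "col2", "col3")

    // Add one column per array position, then drop the source array.
    val result = columns.zipWithIndex.foldLeft(df) {
      case (accDF, (name, idx)) =>
        accDF.withColumn(name, col("DataArray")(idx))
    }.drop("DataArray")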
    

    Hope this helps!

  • 2020-12-01 15:43

    Use Column.apply to index into the array and select each element as its own column:

    import org.apache.spark.sql.functions.col
    
    df.select(
      col("id") +: (0 until 3).map(i => col("DataArray")(i).alias(s"col$i")): _*
    )
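
    Both snippets hard-code three elements. If the array length is not known up front, one option is to read the longest length from the data first. The following is only a sketch under assumptions: the object name ExplodeArrayColumns, the helper explodeArray, and the local SparkSession are hypothetical and for illustration, and the DataFrame is assumed to be non-empty. Depending on the ANSI settings of your Spark version, indexing past the end of a shorter array either yields null or raises an error.

    ```scala
    import org.apache.spark.sql.{DataFrame, SparkSession}
    import org.apache.spark.sql.functions.{col, max, size}

    object ExplodeArrayColumns {
      // Hypothetical helper: replaces `arrayCol` with one column per array position.
      def explodeArray(df: DataFrame, arrayCol: String, prefix: String): DataFrame = {
        // Longest array length in the data; this triggers one extra Spark job
        // and assumes the DataFrame has at least one row.
        val n = df.agg(max(size(col(arrayCol)))).head().getInt(0)

        // Keep every other column, then append prefix0, prefix1, ...
        val kept = df.columns.filter(_ != arrayCol).map(col)
        val elems = (0 until n).map(i => col(arrayCol)(i).alias(s"$prefix$i"))
        df.select(kept ++ elems: _*)
      }

      def main(args: Array[String]): Unit = {
        // Local session for illustration only.
        val spark = SparkSession.builder()
          .master("local[*]")
          .appName("explode-array-columns")
          .getOrCreate()
        import spark.implicits._

        // Sample data mirroring the question.
        val df = Seq(
          ("a", Seq(3, 2, 1)),
          ("b", Seq(4, 2, 1)),
          ("c", Seq(8, 6, 1)),
          ("d", Seq(8, 2, 4))
        ).toDF("id", "DataArray")

        explodeArray(df, "DataArray", "col").show()
        spark.stop()
      }
    }
    ```

    If the length is fixed and known, the plain select above is simpler and avoids the extra job.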
    