Spark combine columns as nested array

灰色年华 2021-01-03 11:44

How can I combine columns in Spark as a nested array?

val inputSmall = Seq(
    ("A", 0.3, "B", 0.25),
    ("A", 0.3, "g", 0.4),
    ("d", 0.0, "f", 0.1),
    ("d", 0.0, "d", 0.7),
    ("A", 0.3, "d", 0.7),
    ("d", 0.0, "g", 0.4),
    ("c", 0.2, "B", 0.25)
).toDF("column1", "transformedCol1", "column2", "transformedCol2")

1 Answer
  • 2021-01-03 12:29

    If you want to combine multiple columns into a new column of ArrayType, you can use the array function:

    import org.apache.spark.sql.functions._
    import spark.implicits._  // assumes a SparkSession named `spark`; enables the $"colName" syntax

    val result = inputSmall.withColumn("combined", array($"transformedCol1", $"transformedCol2"))
    result.show()
    
    +-------+---------------+-------+---------------+-----------+
    |column1|transformedCol1|column2|transformedCol2|   combined|
    +-------+---------------+-------+---------------+-----------+
    |      A|            0.3|      B|           0.25|[0.3, 0.25]|
    |      A|            0.3|      g|            0.4| [0.3, 0.4]|
    |      d|            0.0|      f|            0.1| [0.0, 0.1]|
    |      d|            0.0|      d|            0.7| [0.0, 0.7]|
    |      A|            0.3|      d|            0.7| [0.3, 0.7]|
    |      d|            0.0|      g|            0.4| [0.0, 0.4]|
    |      c|            0.2|      B|           0.25|[0.2, 0.25]|
    +-------+---------------+-------+---------------+-----------+
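
    A possible follow-up sketch, not part of the original answer: the combined ArrayType column can be unpacked again with getItem, and since array requires a common element type, struct is the usual alternative when the columns have mixed types. This reuses result and inputSmall from above; the names "first", "second", and "combinedStruct" are just illustrative.

    // Read individual elements back out of the array column
    result.select($"combined".getItem(0).as("first"), $"combined".getItem(1).as("second")).show()

    // array() needs a common element type; to pair a String with a Double,
    // combine the columns into a struct instead of an array
    val withStruct = inputSmall.withColumn("combinedStruct", struct($"column1", $"transformedCol1"))
    withStruct.printSchema()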
    