How can I combine columns in Spark as a nested array?
val inputSmall = Seq(
  ("A", 0.3, "B", 0.25),
  ("A", 0.3, "g", 0.4),
  ("d", 0.0, "f", 0.1),
  ("d", 0.0, "d", 0.7),
  ("A", 0.3, "d", 0.7),
  ("d", 0.0, "g", 0.4),
  ("c", 0.2, "B", 0.25)
).toDF("column1", "transformedCol1", "column2", "transformedCol2")
If you want to combine multiple columns into a new column of ArrayType, you can use the array function from org.apache.spark.sql.functions:
import org.apache.spark.sql.functions._
import spark.implicits._  // needed for the $"colName" column syntax

val result = inputSmall.withColumn("combined", array($"transformedCol1", $"transformedCol2"))
result.show()
+-------+---------------+-------+---------------+-----------+
|column1|transformedCol1|column2|transformedCol2| combined|
+-------+---------------+-------+---------------+-----------+
| A| 0.3| B| 0.25|[0.3, 0.25]|
| A| 0.3| g| 0.4| [0.3, 0.4]|
| d| 0.0| f| 0.1| [0.0, 0.1]|
| d| 0.0| d| 0.7| [0.0, 0.7]|
| A| 0.3| d| 0.7| [0.3, 0.7]|
| d| 0.0| g| 0.4| [0.0, 0.4]|
| c| 0.2| B| 0.25|[0.2, 0.25]|
+-------+---------------+-------+---------------+-----------+
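One caveat worth knowing: array() requires all of its input columns to share a common type (here both are doubles, so the result is an array of doubles). If you want to nest columns of mixed types, such as a string together with a double, struct() is the usual alternative. A minimal, self-contained sketch (the SparkSession setup and the two-row sample data are illustrative, not from the question):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().master("local[1]").appName("combine").getOrCreate()
import spark.implicits._

// A small sample in the same shape as inputSmall above.
val inputSmall = Seq(("A", 0.3, "B", 0.25), ("c", 0.2, "B", 0.25))
  .toDF("column1", "transformedCol1", "column2", "transformedCol2")

// array() needs a common element type; struct() keeps each field's own type,
// producing a nested StructType column instead of an ArrayType column.
val withStruct = inputSmall.withColumn("combined", struct($"column1", $"transformedCol1"))
withStruct.printSchema()
```

Fields inside the struct stay addressable by name, e.g. withStruct.select("combined.column1").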