Return Seq[Row] from Spark-Scala UDF

半阙折子戏 2021-01-25 09:15

I am using Spark with Scala to do some data processing. I have XML data mapped to a DataFrame. I am passing a Row as a parameter to the UDF and trying to extract two complex types o

2 Answers
  •  迷失自我
    2021-01-25 09:37

    A UDF cannot return Row objects. The return type has to be one of the types listed in the "Value type in Scala" column of the Data Types table.

    The good news is that there should be no need for a UDF here. If Object1 and Object2 have the same schema (it wouldn't work otherwise anyway), you can use the array function:

    import org.apache.spark.sql.functions._
    
    df.select(array(col("Object1"), col("Object2")))
    

    or

    df.select(array(col("path.to.Object1"), col("path.to.Object2")))
    

    if Object1 and Object2 are not top-level columns.
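
    For illustration, here is a minimal sketch of the array approach on a tiny DataFrame, assuming both columns are structs with identical fields. The case classes Obj and Record, the fields id and name, and the output column name objects are hypothetical stand-ins, not taken from the question. It is meant to be pasted into spark-shell or a Scala script with Spark on the classpath.

    ```scala
    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{array, col}

    // Hypothetical struct type standing in for the shared schema of Object1 and Object2.
    case class Obj(id: Int, name: String)
    // Hypothetical record with both structs as top-level columns.
    case class Record(Object1: Obj, Object2: Obj)

    // Local session for the sketch; in spark-shell this returns the existing session.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("array-of-structs")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(Record(Obj(1, "a"), Obj(2, "b"))).toDF()

    // Combine the two struct columns into one array-of-struct column, no UDF involved.
    val combined = df.select(array(col("Object1"), col("Object2")).as("objects"))

    combined.printSchema()
    ```

    Because both inputs share one struct type, the resulting column is an array of that struct, which is the shape a Seq[Row] return value was meant to produce.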
