Return Seq[Row] from Spark-Scala UDF

半阙折子戏 2021-01-25 09:15

I am using Spark with Scala for some data processing. I have XML data mapped to a DataFrame. I am passing a Row as a parameter to the UDF and trying to extract two complex types o

2 Answers
  •  无人共我
    2021-01-25 09:39

    I would like to suggest an alternative that works when the schemas of object1 and object2 differ and you need to return a row. To return a row from a UDF, return a case class whose fields mirror the schema of the Row objects. Here those are object1 and object2, which themselves appear to be rows.

    So do the following:

    // Fields are omitted here; declare whatever fields each object carries.
    case class Object1()

    case class Object2()

    case class Record(object1: Object1, object2: Object2)
    

    Now, inside the UDF, you can create object1 and object2 from firstObject and secondObject, then wrap them:

    val record = Record(object1,object2)
    

    Then return the record from the UDF.

    This way you can return rows even when the schemas differ or some processing is required first.
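
The case-class approach described above can be sketched end to end as follows. This is only a minimal illustration, assuming Spark 2.x or 3.x on the classpath and a local session. The field names (`id`, `name`, `code`, `amount`) and the small in-memory DataFrame are hypothetical stand-ins, since the original snippet leaves the case classes empty and the real data comes from XML.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.functions.{col, struct, udf}

// Hypothetical field layouts; the original answer does not specify them.
case class Object1(id: Long, name: String)
case class Object2(code: String, amount: Double)
case class Record(object1: Object1, object2: Object2)

object ReturnStructFromUdf {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("udf-returning-case-class")
      .getOrCreate()
    import spark.implicits._

    // Stand-in for the XML-derived DataFrame (assumed columns).
    val df = Seq((1L, "a", "X", 2.5)).toDF("id", "name", "code", "amount")

    // The UDF takes a struct column as a Row and returns a case class.
    // Spark derives a nested struct type from the case class fields.
    val toRecord = udf { (r: Row) =>
      val object1 = Object1(r.getAs[Long]("id"), r.getAs[String]("name"))
      val object2 = Object2(r.getAs[String]("code"), r.getAs[Double]("amount"))
      Record(object1, object2)
    }

    // Pack all columns into one struct so the UDF sees them as a single Row.
    val result = df.select(toRecord(struct(df.columns.map(col): _*)).as("record"))

    result.printSchema()
    result.select("record.object1.name", "record.object2.amount").show()

    spark.stop()
  }
}
```

Because the return type is a case class rather than a raw Row, no explicit output schema has to be passed to `udf`; the nested fields can then be selected with dotted paths such as `record.object1.name`.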

    I know this doesn't directly answer your question, but it seemed like a good opportunity to mention this approach.
