Spark dataframe: Pivot and Group based on columns

执笔经年 2021-01-19 15:10

I have an input dataframe, shown below, with columns id, app, and customer.

Input dataframe

    +---+---+--------+
    | id|app|customer|
    +---+---+--------+
    |id3| bc|      TR|
    |id3| bc|      WM|
    |id1| fe|      WM|
    |id1| fw|      CS|
    |id1| fw|      WM|
    |id2| fw|      CS|
    +---+---+--------+

I want to group by id and pivot on app so that each id gets one row, with the customer values collected under each app column.
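For reference, the input can be rebuilt as a small DataFrame — a sketch assuming a local SparkSession, with the rows inferred from the pivoted output in the answers below:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("pivot-example").master("local[*]").getOrCreate()
import spark.implicits._

// (id, app, customer) rows inferred from the answers' pivoted output
val df = Seq(
  ("id3", "bc", "TR"),
  ("id3", "bc", "WM"),
  ("id1", "fe", "WM"),
  ("id1", "fw", "CS"),
  ("id1", "fw", "WM"),
  ("id2", "fw", "CS")
).toDF("id", "app", "customer")
```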
2 Answers
  • 2021-01-19 15:22

    Using concat_ws we can join each collected array into a comma-separated string, which removes the square brackets:

    df.groupBy("id").pivot("app").agg(concat_ws(",",collect_list("customer")))
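Spelled out with its imports, the same aggregation might look like this (a sketch; the order of values inside each cell follows collect_list, which is not guaranteed):

```scala
import org.apache.spark.sql.functions.{collect_list, concat_ws}

df.groupBy("id")
  .pivot("app")
  .agg(concat_ws(",", collect_list("customer")))
  .show()
// Cells with no matching rows come out as empty strings instead of empty lists.
```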
    
  • 2021-01-19 15:41

    You can use collect_list if you can tolerate an empty list in the cells that have no matching rows:

    df.groupBy("id").pivot("app").agg(collect_list("customer")).show
    +---+--------+----+--------+
    | id|      bc|  fe|      fw|
    +---+--------+----+--------+
    |id3|[TR, WM]|  []|      []|
    |id1|      []|[WM]|[CS, WM]|
    |id2|      []|  []|    [CS]|
    +---+--------+----+--------+
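If the empty lists are unwanted, one option is to null them out after the pivot — a sketch that applies when and size to every pivoted column except id:

```scala
import org.apache.spark.sql.functions.{col, collect_list, lit, size, when}

val pivoted = df.groupBy("id").pivot("app").agg(collect_list("customer"))

// Replace zero-length lists with null in each pivoted column
val cleaned = pivoted.columns.filter(_ != "id").foldLeft(pivoted) { (acc, c) =>
  acc.withColumn(c, when(size(col(c)) === 0, lit(null)).otherwise(col(c)))
}
```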
    