Spark dataframe: Pivot and Group based on columns

执笔经年 2021-01-19 15:10

I have an input DataFrame, shown below, with the columns id, app, and customer.

Input dataframe

+--------------------+-----+---------+
|                          


        
2 answers
  •  走了就别回头了
    2021-01-19 15:41

    You can use collect_list, provided you can live with an empty list in the cells that should be zero:

    import org.apache.spark.sql.functions.collect_list
    df.groupBy("id").pivot("app").agg(collect_list("customer")).show()
    +---+--------+----+--------+
    | id|      bc|  fe|      fw|
    +---+--------+----+--------+
    |id3|[TR, WM]|  []|      []|
    |id1|      []|[WM]|[CS, WM]|
    |id2|      []|  []|    [CS]|
    +---+--------+----+--------+
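
To try the answer end to end, here is a minimal self-contained sketch. The input table in the question is cut off, so the sample rows below are an assumption, inferred from the output shown in this answer rather than taken from the original data. The object name, app name, and local master setting are likewise only illustrative.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.collect_list

object PivotCollectListSketch {
  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (assumed setup, not from the question).
    val spark = SparkSession.builder()
      .appName("pivot-collect-list-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical input rows, reconstructed from the answer's output.
    val df = Seq(
      ("id1", "fe", "WM"),
      ("id1", "fw", "CS"),
      ("id1", "fw", "WM"),
      ("id2", "fw", "CS"),
      ("id3", "bc", "TR"),
      ("id3", "bc", "WM")
    ).toDF("id", "app", "customer")

    // One row per id, one column per distinct app value,
    // each cell holding the list of customers for that (id, app) pair.
    val pivoted = df
      .groupBy("id")
      .pivot("app")
      .agg(collect_list("customer"))

    // Sort by id only to make the printed table easier to read.
    pivoted.orderBy("id").show()

    spark.stop()
  }
}
```

Note that collect_list does not guarantee the order of elements inside each list, so the values within a cell may appear in a different order from run to run.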
    
