How to pivot Spark DataFrame?

闹比i 2020-11-21 06:43

I am starting to use Spark DataFrames, and I need to pivot the data to create multiple columns out of a single column with multiple rows. There is built-in functionality

10 answers
  •  梦毁少年i
    2020-11-21 07:17

    There is a simple method for pivoting:

      id  tag  value
      1   US    50
      1   UK    100
      1   Can   125
      2   US    75
      2   UK    150
      2   Can   175
    
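      // Assumes an existing SparkSession named sparkSession
      import sparkSession.implicits._

      val data = Seq(
        (1, "US", 50),
        (1, "UK", 100),
        (1, "Can", 125),
        (2, "US", 75),
        (2, "UK", 150),
        (2, "Can", 175)
      )

      val dataFrame = data.toDF("id", "tag", "value")

      // One row per id, one column per distinct tag, max(value) in each cell
      val df2 = dataFrame
        .groupBy("id")
        .pivot("tag")
        .max("value")
      df2.show()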
    
    +---+---+---+---+
    | id|Can| UK| US|
    +---+---+---+---+
    |  1|125|100| 50|
    |  2|175|150| 75|
    +---+---+---+---+
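
    A variation that may be worth knowing: `pivot` also accepts an explicit list of values. The sketch below is only an illustration, assuming Spark SQL is on the classpath and a local session is acceptable; the object name `PivotSketch`, the helper names `sample` and `pivotByTag`, and the app name are made up for this example. Listing the tags up front fixes the column order and spares Spark the extra pass that discovers the distinct tags.

    ```scala
    import org.apache.spark.sql.{DataFrame, SparkSession}
    import org.apache.spark.sql.functions.max

    // Hypothetical helper object, not part of the original answer.
    object PivotSketch {

      // Same sample rows as in the answer above.
      def sample(spark: SparkSession): DataFrame = {
        import spark.implicits._
        Seq(
          (1, "US", 50), (1, "UK", 100), (1, "Can", 125),
          (2, "US", 75), (2, "UK", 150), (2, "Can", 175)
        ).toDF("id", "tag", "value")
      }

      // Pivot with the tag values given explicitly, so Spark does not
      // have to compute the distinct tags before building the columns.
      def pivotByTag(df: DataFrame): DataFrame =
        df.groupBy("id")
          .pivot("tag", Seq("US", "UK", "Can"))
          .agg(max("value"))
          .orderBy("id")

      def main(args: Array[String]): Unit = {
        // Local session for illustration only; master and app name are assumptions.
        val spark = SparkSession.builder()
          .appName("pivot-sketch")
          .master("local[*]")
          .getOrCreate()

        pivotByTag(sample(spark)).show()
        spark.stop()
      }
    }
    ```

    With the explicit list, the pivoted columns come out in the order given (US, UK, Can) instead of the sorted order shown above. A tag missing from the list is simply dropped from the result.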
    
