How to pivot Spark DataFrame?

闹比i 2020-11-21 06:43

I am starting to use Spark DataFrames and need to pivot the data, turning one column with multiple rows into multiple columns. There is built in functionality

10 answers
  •  礼貌的吻别
    2020-11-21 07:22

    Spark's support for pivoting DataFrames has improved over time. A pivot function was added to the DataFrame API in Spark 1.6; it had a performance issue, which was corrected in Spark 2.0.

    However, if you are using an older version, note that pivot is a very expensive operation, so it is recommended to pass the column values (if known) as an argument to the function, as shown below.

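    // Listing the pivot values up front saves Spark from computing the distinct countries itself
    val countries = Seq("USA", "China", "Canada", "Mexico")
    // One row per Product, one column per listed country, each cell the summed Amount
    val pivotDF = df.groupBy("Product").pivot("Country", countries).sum("Amount")
    pivotDF.show()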
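
    The snippet above assumes an existing df. For anyone who wants to try it end to end, here is a minimal self-contained sketch. The sample rows, the object name PivotSketch, the helper pivotSales and the local[*] session settings are all made up for illustration; only the column names and the pivot call come from the snippet above.

    ```scala
    import org.apache.spark.sql.{DataFrame, SparkSession}

    object PivotSketch {
      // Hypothetical helper: builds sample data and pivots it by country.
      def pivotSales(spark: SparkSession): DataFrame = {
        import spark.implicits._

        // Made-up sample rows; only the column names match the answer above.
        val df = Seq(
          ("Banana", "USA", 1000),
          ("Banana", "China", 400),
          ("Carrot", "USA", 1500),
          ("Carrot", "Canada", 2000),
          ("Banana", "USA", 200)
        ).toDF("Product", "Country", "Amount")

        // Explicit pivot values: Spark does not need to scan for distinct countries.
        val countries = Seq("USA", "China", "Canada", "Mexico")
        df.groupBy("Product").pivot("Country", countries).sum("Amount")
      }

      def main(args: Array[String]): Unit = {
        // Local session for experimentation only (assumed settings).
        val spark = SparkSession.builder()
          .appName("PivotSketch")
          .master("local[*]")
          .getOrCreate()

        pivotSales(spark).show()
        spark.stop()
      }
    }
    ```

    The result should have one row per Product and one column for each listed country. A country with no matching rows for a product (Mexico in this sample data) shows up as null rather than being dropped, because the column list was given explicitly.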
    

    This is explained in detail in Pivoting and Unpivoting Spark DataFrame.

    Happy Learning !!
