Spark - Sum of row values

囚心锁ツ 2020-12-09 13:57

I have the following DataFrame:

January | February | March
-----------------------------
  10    |    10    |  10
  20    |    20    |  20
  50    |    50            


        
5 Answers
  •  有刺的猬
    2020-12-09 14:36

    You were very close with this:

    val newDf: DataFrame = df.select(colsToSum.map(col):_*).foreach ...
    

    Instead, try this:

    val newDf = df.select(colsToSum.map(col).reduce((c1, c2) => c1 + c2) as "sum")
    

    I think this is the best of the answers, because it is as fast as the answer with the hard-coded SQL query and as convenient as the one that uses the UDF. It's the best of both worlds, and I didn't even add a full line of code!
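
    For anyone who wants to try the reduce approach end to end, here is a minimal sketch rather than a definitive implementation. It assumes a local SparkSession. The object name `RowSumExample`, the helper `withRowSum`, and the sample rows are made up for illustration (the rows are modeled on the question's table, with the missing March value in the last row assumed to be 50). It uses `withColumn` instead of `select` so the original columns are kept next to the new "sum" column.

    ```scala
    import org.apache.spark.sql.{DataFrame, SparkSession}
    import org.apache.spark.sql.functions.col

    object RowSumExample {
      // Hypothetical helper: appends a "sum" column holding the row-wise
      // total of the named columns. colsToSum must be non-empty, because
      // reduce throws on an empty collection.
      def withRowSum(df: DataFrame, colsToSum: Seq[String]): DataFrame =
        df.withColumn("sum", colsToSum.map(col).reduce((c1, c2) => c1 + c2))

      def main(args: Array[String]): Unit = {
        // Local session for experimentation only (an assumption, not part of the question).
        val spark = SparkSession.builder()
          .appName("RowSumExample")
          .master("local[*]")
          .getOrCreate()
        import spark.implicits._

        // Illustrative data; the last March value is assumed.
        val df = Seq((10, 10, 10), (20, 20, 20), (50, 50, 50))
          .toDF("January", "February", "March")

        // Sum every column; pass a narrower list to sum only some of them.
        withRowSum(df, df.columns.toSeq).show()

        spark.stop()
      }
    }
    ```

    One caveat: `+` on columns follows SQL semantics, so a null in any summed column makes that row's sum null. If that matters for your data, wrapping each column in `coalesce(col(name), lit(0))` before reducing is one way around it.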
