Row aggregations in Scala

Asked by 再見小時候 on 2021-01-28 07:24

I am looking for a way to add a new column to a DataFrame in Scala that calculates the min/max of the values in col1, col2, col3, and col4.

1 Answer
  • Answered 2021-01-28 08:13

    Porting this Python answer by user6910411:

    import org.apache.spark.sql.functions._
    // toDF and the $ column syntax require the implicits of an active SparkSession:
    // import spark.implicits._
    
    val df = Seq(
      (1, 3, 0, 9, "a", "b", "c")
    ).toDF("col1", "col2", "col3", "col4", "col5", "col6", "Col7")
    
    // Columns to aggregate across each row
    val cols = Seq("col1", "col2", "col3", "col4")
    
    // greatest/least take a varargs list of columns and compute the row-wise max/min
    val rowMax = greatest(cols.map(col): _*).alias("max")
    val rowMin = least(cols.map(col): _*).alias("min")
    
    df.select($"*", rowMin, rowMax).show
    
    // +----+----+----+----+----+----+----+---+---+
    // |col1|col2|col3|col4|col5|col6|Col7|min|max|
    // +----+----+----+----+----+----+----+---+---+
    // |   1|   3|   0|   9|   a|   b|   c|  0|  9|
    // +----+----+----+----+----+----+----+---+---+
    
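    If you are on Spark 2.4 or later, a possible alternative (a sketch, reusing the same df and cols from above) is to pack the target columns into an array column and use the built-in array_min/array_max functions instead of least/greatest:
    
    // Sketch assuming Spark 2.4+, where array_min/array_max are available.
    // Pack the target columns into a single array column, then reduce it per row.
    val asArray = array(cols.map(col): _*)
    
    df.select(
      $"*",
      array_min(asArray).alias("min"),
      array_max(asArray).alias("max")
    ).show
    
    Both approaches stay entirely in Spark's built-in column functions, so they avoid the serialization cost of a UDF.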