Given the code below, how would I go about adding a count column? (e.g. .count(\"*\").as(\"count\"))
Final output to look like something like this:
+
You can simply append count("*").as("count")
to aggExprs.tail
in your agg
, as shown below:
df.
groupBy("id").agg(aggExprs.head, aggExprs.tail :+ count("*").as("count"): _*).
show
// +---+------+------+-----------------------------+-----+
// | id|sum(d)|max(b)|concat_ws(,, collect_list(s))|count|
// +---+------+------+-----------------------------+-----+
// | 1| 1.0| true| a| 1|
// | 3| 3.0| false| b| 1|
// | 2| 4.0| false| b,c| 2|
// +---+------+------+-----------------------------+-----+