问题
How to select row with max value per category in M of Power BI. Suppose we have table:
+----------+-------+------------+
| Category | Value | Date |
+----------+-------+------------+
| apples | 1 | 2018-07-01 |
| apples | 2 | 2018-07-02 |
| apples | 3 | 2018-07-03 |
| bananas | 7 | 2018-07-04 |
| bananas | 8 | 2018-07-05 |
| bananas | 9 | 2018-07-06 |
+----------+-------+------------+
Desired results are:
+----------+-------+------------+
| Category | Value | Date |
+----------+-------+------------+
| apples | 3 | 2018-07-03 |
| bananas | 9 | 2018-07-06 |
+----------+-------+------------+
Here is a start table for PBI:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSiwoyEktVtJRMgRiIwNDC10Dc10DQ6VYHSQ5I2Q5I1Q5Y2Q5Y7BcUmIeEIIkzZElTdAkLZAlTdEkLZElzZRiYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Category = _t, Value = _t, Date = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Category", type text}, {"Value", Int64.Type}, {"Date", type date}})
in
#"Changed Type"
I wonder if there is a way to come to desired results in subsequent steps within only one table, by adding some magic column IsMax:
+----------+-------+------------+-------+
| Category | Value | Date | IsMax |
+----------+-------+------------+-------+
| apples | 1 | 2018-07-01 | 0 |
| apples | 2 | 2018-07-02 | 0 |
| apples | 3 | 2018-07-03 | 1 |
| bananas | 7 | 2018-07-04 | 0 |
| bananas | 8 | 2018-07-05 | 0 |
| bananas | 9 | 2018-07-06 | 1 |
+----------+-------+------------+-------+
回答1:
Doing a basic Group By in the Power Query Editor (group by Category
and take the max over Value
) gets you this table:
+----------+-------+
| Category | Value |
+----------+-------+
| apples | 3 |
| bananas | 9 |
+----------+-------+
Add a custom column IsMax
that is simply the value 1
to this table and then merge (left outer join) it with your original table matching on both Category
and Value
. Finally, expand the IsMax
column to get your desired table, except with null
instead of 0
. You can replace the null
values if you choose.
Here's the M code for all those steps:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSiwoyEktVtJRMgRiIwNDC10Dc10DQ6VYHSQ5I2Q5I1Q5Y2Q5Y7BcUmIeEIIkzZElTdAkLZAlTdEkLZElzZRiYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Category = _t, Value = _t, Date = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Value", Int64.Type}, {"Date", type date}, {"Category", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Category"}, {{"Value", each List.Max([Value]), Int64.Type}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "IsMax", each 1, Int64.Type),
#"Merged Queries" = Table.NestedJoin(#"Changed Type",{"Category", "Value"},#"Added Custom",{"Category", "Value"},"Added Custom",JoinKind.LeftOuter),
#"Expanded Added Custom" = Table.ExpandTableColumn(#"Merged Queries", "Added Custom", {"IsMax"}, {"IsMax"})
in
#"Expanded Added Custom"
回答2:
I ended up getting MAX
per category through index
. Idea described here: https://stackoverflow.com/a/51498237/1903793
Approach #1 is one-liner snipped in R transformation:
library(dplyr)
output <- dataset %>% group_by(Category) %>% mutate(row_no_by_category = row_number(desc(Date)))
Approach #2, done completely in PBI:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSiwoyEktVtJRMgRiIwNDC10Dc10DQ6VYHSQ5I2Q5I1Q5Y2Q5Y7BcUmIeEIIkzZElTdAkLZAlTdEkLZElzZRiYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Category = _t, Value = _t, Date = _t]),
#"Grouped rows" = Table.Group(Source, {"Category"}, {{"NiceTable", each Table.AddIndexColumn(Table.Sort(_,{{"Date", Order.Descending}} ), "Index",1,1), type table}} ),
#"Expanded NiceTable" = Table.ExpandTableColumn(#"Grouped rows", "NiceTable", {"Value", "Date", "Index"}, {"Value", "Date", "Index"}),
#"Filtered Rows" = Table.SelectRows(#"Expanded NiceTable", each ([Index] = 1))
in
#"Filtered Rows"
来源:https://stackoverflow.com/questions/51437962/select-row-with-max-value-per-category-power-bi