Select row with MAX value per category Power BI

天涯浪子 提交于 2019-12-30 20:01:17

问题


How to select row with max value per category in M of Power BI. Suppose we have table:

+----------+-------+------------+
| Category | Value |    Date    |
+----------+-------+------------+
| apples   |     1 | 2018-07-01 |
| apples   |     2 | 2018-07-02 |
| apples   |     3 | 2018-07-03 |
| bananas  |     7 | 2018-07-04 |
| bananas  |     8 | 2018-07-05 |
| bananas  |     9 | 2018-07-06 |
+----------+-------+------------+

Desired results are:

+----------+-------+------------+
| Category | Value |    Date    |
+----------+-------+------------+
| apples   |     3 | 2018-07-03 |
| bananas  |     9 | 2018-07-06 |
+----------+-------+------------+

Here is a start table for PBI:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSiwoyEktVtJRMgRiIwNDC10Dc10DQ6VYHSQ5I2Q5I1Q5Y2Q5Y7BcUmIeEIIkzZElTdAkLZAlTdEkLZElzZRiYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Category = _t, Value = _t, Date = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Category", type text}, {"Value", Int64.Type}, {"Date", type date}})
in
    #"Changed Type"

I wonder if there is a way to come to desired results in subsequent steps within only one table, by adding some magic column IsMax:

+----------+-------+------------+-------+
| Category | Value |    Date    | IsMax |
+----------+-------+------------+-------+
| apples   |     1 | 2018-07-01 |     0 |
| apples   |     2 | 2018-07-02 |     0 |
| apples   |     3 | 2018-07-03 |     1 |
| bananas  |     7 | 2018-07-04 |     0 |
| bananas  |     8 | 2018-07-05 |     0 |
| bananas  |     9 | 2018-07-06 |     1 |
+----------+-------+------------+-------+

回答1:


Doing a basic Group By in the Power Query Editor (group by Category and take the max over Value) gets you this table:

+----------+-------+
| Category | Value |
+----------+-------+
| apples   |     3 |
| bananas  |     9 |
+----------+-------+

Add a custom column IsMax that is simply the value 1 to this table and then merge (left outer join) it with your original table matching on both Category and Value. Finally, expand the IsMax column to get your desired table, except with null instead of 0. You can replace the null values if you choose.

Here's the M code for all those steps:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSiwoyEktVtJRMgRiIwNDC10Dc10DQ6VYHSQ5I2Q5I1Q5Y2Q5Y7BcUmIeEIIkzZElTdAkLZAlTdEkLZElzZRiYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Category = _t, Value = _t, Date = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Value", Int64.Type}, {"Date", type date}, {"Category", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Category"}, {{"Value", each List.Max([Value]), Int64.Type}}),
    #"Added Custom" = Table.AddColumn(#"Grouped Rows", "IsMax", each 1, Int64.Type),
    #"Merged Queries" = Table.NestedJoin(#"Changed Type",{"Category", "Value"},#"Added Custom",{"Category", "Value"},"Added Custom",JoinKind.LeftOuter),
    #"Expanded Added Custom" = Table.ExpandTableColumn(#"Merged Queries", "Added Custom", {"IsMax"}, {"IsMax"})
in
    #"Expanded Added Custom"



回答2:


I ended up getting MAX per category through index. Idea described here: https://stackoverflow.com/a/51498237/1903793

Approach #1 is one-liner snipped in R transformation:

library(dplyr)
output <- dataset %>% group_by(Category) %>% mutate(row_no_by_category = row_number(desc(Date)))

Approach #2, done completely in PBI:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSiwoyEktVtJRMgRiIwNDC10Dc10DQ6VYHSQ5I2Q5I1Q5Y2Q5Y7BcUmIeEIIkzZElTdAkLZAlTdEkLZElzZRiYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Category = _t, Value = _t, Date = _t]),
    #"Grouped rows" = Table.Group(Source, {"Category"}, {{"NiceTable", each Table.AddIndexColumn(Table.Sort(_,{{"Date", Order.Descending}} ), "Index",1,1), type table}} ),
    #"Expanded NiceTable" = Table.ExpandTableColumn(#"Grouped rows", "NiceTable", {"Value", "Date", "Index"}, {"Value", "Date", "Index"}),
    #"Filtered Rows" = Table.SelectRows(#"Expanded NiceTable", each ([Index] = 1))
in
    #"Filtered Rows"


来源:https://stackoverflow.com/questions/51437962/select-row-with-max-value-per-category-power-bi

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!