问题
I am trying to graph data from a global table that contains information about inventory.
The table has a schema of (inventory amount, inventory sales, date, time, store)
Challenge is: every store logs 4-5 rows every day about inventory and sales and I have 4 stores.
My goal is to filter this table into a new table that includes only the last inventory amount and inventory sales of every day for every store.
Below is my input table:
I'd like to generate an output like below. Note that only the second entry for store 600 on 9/2 is kept.
Any suggestion is appreciated.
Thanks.
I have tried some filtering expressions but none worked worked for me. enter image description here
[Update]: So I tried the example from the answer below and I got this:
"Wednesday september 4th 2019, cell: 1new sill-fender cell3" is showing 18023; it should only show 1386
回答1:
Typically I would suggest doing this in M. It's a bit more verbose, but I think much clearer in what's happening. I've got a solution in M and in DAX below. For either solution, the approach is exactly the same:
- Build a helper structure of (Store, Date, MaxTime) which is made of unique pairs of (Store, Date) and has the latest time that exists for that pair.
- Filter the input table by the values in this helper table.
M solution first:
// Input - query with raw data - no work here:
let
Source = #table(
{"inventory amount", "Sale in dollar", "date", "time", "store"},
{
{54, 100, "2019-09-03", "09:55:00", 500},
{52, 200, "2019-09-04", "10:34:00", 500},
{49, 300, "2019-09-05", "09:43:00", 500},
{112, 200, "2019-09-02", "13:13:00", 600},
{111, 100, "2019-09-02", "13:19:00", 600},
{109, 200, "2019-09-03", "15:25:00", 600}
}
),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"inventory amount", Int64.Type}, {"Sale in dollar", Int64.Type}, {"date", type date}, {"time", type time}, {"store", Int64.Type}})
in
#"Changed Type"
// StoreDateMaxTime - this is our helper table. This is set not to load into the
// model, just used in PQ for data processing.
let
Source = Input,
#"Grouped Rows" = Table.Group(Source, {"date", "store"}, {{"MaxTime", each List.Max([time]), type time}})
in
#"Grouped Rows"
// Output - our final table
let
Source = Input,
#"Merged Queries" =
Table.NestedJoin(
Source, {"store", "date", "time"},
StoreDateMaxTime, {"store", "date", "MaxTime"},
"HelperDedupe",
JoinKind.Inner),
// Note that we join above, but don't use the results of the join in any way
// we only care about the filtering semantic of an inner join.
#"Removed Columns" = Table.RemoveColumns(#"Merged Queries",{"HelperDedupe"})
in
#"Removed Columns"
Here is the DAX version doing exactly the same thing:
Output2 =
VAR StoreDateMaxTime =
ADDCOLUMNS (
GROUPBY (
'Input',
'Input'[store],
'Input'[date]
),
"MaxTime", CALCULATE ( MAX ( 'Input'[time] ) )
)
RETURN
CALCULATETABLE (
'Input',
TREATAS ( StoreDateMaxTime, 'Input'[store], 'Input'[date], 'Input'[time] )
)
Note that you could also just load the full input table and define a measure as follows that will return the last inventory record:
FinalInventoryAmount =
VAR LastStoreDateTimes =
GENERATE(
VALUES ( 'Input'[store] ),
CALCULATETABLE (
TOPN (
1,
GROUPBY (
'Input',
'Input'[date],
'Input'[time]
),
'Input'[date], DESC,
'Input'[time], DESC
)
)
)
RETURN
CALCULATE (
SUM ( 'Input'[inventory amount] ),
TREATAS ( LastStoreDateTimes, 'Input'[store], 'Input'[date], 'Input'[time] )
)
This measure should look fairly similar to the pattern for making the table. We're again doing a largely similar operation. But to make it work well with filtering and totals, we need to do something slightly different:
- Start with only the stores in context:
VALUES ( 'Input'[Store] )
- Find for each store its latest ('Input'[Date], 'Input'[Time]) pair (the second arg to
GENERATE
) - Use those values from (2) to filter the fact table and only consider the latest row per store (the `CALCULATE).
You'll see that at the grand total level, this is returning 158, which is the last value for store 500 (the 49 from 9/5) and the last value for store 600 (the 109 from 9/3) to get 49 + 109 = 158
.
Here's a visual using the measure:
Edit: explaining more about how the measure works.
Here is a cross-tab (a Matrix visual), with date and time on the rows, and the store id on the columns.
Let's walk through this. On 9/2, there are no records for store 500. There are two records for store 600. We can see the individual times of those. The later is 111 at 13:19. You can see that the total for store 600 on 9/2 is the later amount, 111. The total for 9/2, across all stores is 111, the latest amount for store 600.
On 9/3, there is one record for store 500: 54 at 09:55. There is also a single record for store 600: 109 at 15:25. We can see these individual transactions at the time level. At the date subtotal, we see the amount for store 500 and 600 next to one another. The total column for 9/3 shows the addition of 54 + 109 = 163
, which is the latest amount for each store added together for a total across all stores.
9/4 and 9/5 are boring, each with only a single transaction for store 500.
The Total row shows values for each store and a grand total in the total column.
The total for a store is the last value for that store. For store 500, this is 49, from 9/5 at 09:43. For store 600, this is 109, from 9/3 at 15:25. The grand total (total row, total column) is these two values - the latest from each store - added together: 49 + 109 = 158
.
The totals only consider one value from each store - whatever the latest one is in context.
Edit2: for values that have multiple records at the same instant. Updating the measure only, but it should be obvious how to include the same value for the M and DAX table versions.
FinalInventoryAmount v2
VAR LastStoreDateTimes =
GENERATE(
VALUES ( 'Input'[store] ),
CALCULATETABLE (
SAMPLE ( // SAMPLE will ignore ties for those identical rows
1,
GROUPBY (
'Input',
'Input'[date],
'Input'[time],
'Input'[inventory amount]
),
'Input'[date], DESC,
'Input'[time], DESC,
'Input'[inventory amount], ASC // make sure we're getting the min
// amount value.
)
)
)
RETURN
CALCULATE (
SUM ( 'Input'[inventory amount] ),
// and below is the other difference
TREATAS (
LastStoreDateTimes,
'Input'[store],
'Input'[date],
'Input'[time],
'Input'[inventory amount] // we're also only including rows that tie
// on the min amount here.
)
)
If you don't care about totaling behavior, you could also just replace the SUM
with a MIN
to get the smallest amount at a (store, date, time) tuple.
来源:https://stackoverflow.com/questions/57805690/dax-filtering-expressions-last-date-of-every-day