Dax filtering expressions, last date of every day

前端 未结 1 358
南旧
南旧 2021-01-27 10:30

I am trying to graph data from a global table that contains information about inventory.

The table has a schema of (inventory amount, inventory sales, date, time, store)

相关标签:
1条回答
  • 2021-01-27 11:21

    Typically I would suggest doing this in M. It's a bit more verbose, but I think much clearer in what's happening. I've got a solution in M and in DAX below. For either solution, the approach is exactly the same:

    1. Build a helper structure of (Store, Date, MaxTime) which is made of unique pairs of (Store, Date) and has the latest time that exists for that pair.
    2. Filter the input table by the values in this helper table.

    M solution first:

    // Input - query with raw data - no work here:
    let
        Source = #table(
            {"inventory amount", "Sale in dollar", "date", "time", "store"},
            {
                {54, 100, "2019-09-03", "09:55:00", 500},
                {52, 200, "2019-09-04", "10:34:00", 500},
                {49, 300, "2019-09-05", "09:43:00", 500},
                {112, 200, "2019-09-02", "13:13:00", 600},
                {111, 100, "2019-09-02", "13:19:00", 600},
                {109, 200, "2019-09-03", "15:25:00", 600}
            }
        ),
        #"Changed Type" = Table.TransformColumnTypes(Source,{{"inventory amount", Int64.Type}, {"Sale in dollar", Int64.Type}, {"date", type date}, {"time", type time}, {"store", Int64.Type}})
    in
        #"Changed Type"
    
    // StoreDateMaxTime - this is our helper table. This is set not to load into the
    // model, just used in PQ for data processing.
    let
        Source = Input,
        #"Grouped Rows" = Table.Group(Source, {"date", "store"}, {{"MaxTime", each List.Max([time]), type time}})
    in
        #"Grouped Rows"
    
    // Output - our final table
    let
        Source = Input,
        #"Merged Queries" =
            Table.NestedJoin(
                Source, {"store", "date", "time"},
                StoreDateMaxTime, {"store", "date", "MaxTime"},
                "HelperDedupe",
                JoinKind.Inner),
        // Note that we join above, but don't use the results of the join in any way
        // we only care about the filtering semantic of an inner join.
        #"Removed Columns" = Table.RemoveColumns(#"Merged Queries",{"HelperDedupe"})
    in
        #"Removed Columns"
    

    Here is the DAX version doing exactly the same thing:

    Output2 = 
    VAR StoreDateMaxTime =
        ADDCOLUMNS (
            GROUPBY (
                'Input',
                'Input'[store],
                'Input'[date]
            ),
            "MaxTime", CALCULATE ( MAX ( 'Input'[time] ) )
        )
    RETURN
        CALCULATETABLE (
            'Input',
            TREATAS ( StoreDateMaxTime, 'Input'[store], 'Input'[date], 'Input'[time] )
        )
    

    Note that you could also just load the full input table and define a measure as follows that will return the last inventory record:

    FinalInventoryAmount = 
    VAR LastStoreDateTimes =
        GENERATE(
            VALUES ( 'Input'[store] ),
            CALCULATETABLE (
                TOPN (
                    1,
                    GROUPBY (
                        'Input',
                        'Input'[date],
                        'Input'[time]
                    ),
                    'Input'[date], DESC,
                    'Input'[time], DESC
                )
            )
        )
    RETURN
        CALCULATE (
            SUM ( 'Input'[inventory amount] ),
            TREATAS ( LastStoreDateTimes, 'Input'[store], 'Input'[date], 'Input'[time] )
        )
    

    This measure should look fairly similar to the pattern for making the table. We're again doing a largely similar operation. But to make it work well with filtering and totals, we need to do something slightly different:

    1. Start with only the stores in context: VALUES ( 'Input'[Store] )
    2. Find for each store its latest ('Input'[Date], 'Input'[Time]) pair (the second arg to GENERATE)
    3. Use those values from (2) to filter the fact table and only consider the latest row per store (the `CALCULATE).

    You'll see that at the grand total level, this is returning 158, which is the last value for store 500 (the 49 from 9/5) and the last value for store 600 (the 109 from 9/3) to get 49 + 109 = 158.

    Here's a visual using the measure:

    Edit: explaining more about how the measure works.

    Here is a cross-tab (a Matrix visual), with date and time on the rows, and the store id on the columns.

    Let's walk through this. On 9/2, there are no records for store 500. There are two records for store 600. We can see the individual times of those. The later is 111 at 13:19. You can see that the total for store 600 on 9/2 is the later amount, 111. The total for 9/2, across all stores is 111, the latest amount for store 600.

    On 9/3, there is one record for store 500: 54 at 09:55. There is also a single record for store 600: 109 at 15:25. We can see these individual transactions at the time level. At the date subtotal, we see the amount for store 500 and 600 next to one another. The total column for 9/3 shows the addition of 54 + 109 = 163, which is the latest amount for each store added together for a total across all stores.

    9/4 and 9/5 are boring, each with only a single transaction for store 500.

    The Total row shows values for each store and a grand total in the total column. The total for a store is the last value for that store. For store 500, this is 49, from 9/5 at 09:43. For store 600, this is 109, from 9/3 at 15:25. The grand total (total row, total column) is these two values - the latest from each store - added together: 49 + 109 = 158.

    The totals only consider one value from each store - whatever the latest one is in context.

    Edit2: for values that have multiple records at the same instant. Updating the measure only, but it should be obvious how to include the same value for the M and DAX table versions.

    FinalInventoryAmount v2
    VAR LastStoreDateTimes =
        GENERATE(
            VALUES ( 'Input'[store] ),
            CALCULATETABLE (
                SAMPLE ( // SAMPLE will ignore ties for those identical rows
                    1,
                    GROUPBY (
                        'Input',
                        'Input'[date],
                        'Input'[time],
                        'Input'[inventory amount]
                    ),
                    'Input'[date], DESC,
                    'Input'[time], DESC,
                    'Input'[inventory amount], ASC // make sure we're getting the min
                                                   // amount value.
                )
            )
        )
    RETURN
        CALCULATE (
            SUM ( 'Input'[inventory amount] ),
            // and below is the other difference
            TREATAS (
                LastStoreDateTimes,
                'Input'[store],
                'Input'[date],
                'Input'[time],
                'Input'[inventory amount] // we're also only including rows that tie
                                          // on the min amount here.
            )
        )
    

    If you don't care about totaling behavior, you could also just replace the SUM with a MIN to get the smallest amount at a (store, date, time) tuple.

    0 讨论(0)
提交回复
热议问题