Return duration of an item from its transactions, many to many, SQL

不问归期 submitted on 2020-07-04 00:19:01

Question


Hopefully I can get some help on this.

Situation

There are two incoming stations and one outgoing station. Items are scanned in and out. I need to know how long an item was in the station. Let's consider 'in station' to be the time between its incoming scan and its outgoing scan.

Problem

  1. An item can be (accidentally) scanned multiple times into either station. For this I was thinking of checking whether the scans were made on the same day (ignoring the time of day) and then returning the earliest scan time.

  2. An item can come in and out of the station multiple times (multiple in and out scans).

  3. If an item was scanned into both incoming locations, I need the earliest time.

Sample data:

╔═════════╦════════╦══════════════════╦════════════════╦══════════╗
║ Row_num ║ ItemID ║      Dates       ║  LocationName  ║   Type   ║
╠═════════╬════════╬══════════════════╬════════════════╬══════════╣
║       1 ║ ItemA  ║ 1/7/20 12:49 PM  ║ Outgoing_Loc   ║ Outgoing ║
║       2 ║ ItemA  ║ 1/2/20 7:29 AM   ║ Incoming_Loc_A ║ Incoming ║
║       3 ║ ItemB  ║ 1/3/20 11:01 AM  ║ Outgoing_Loc   ║ Outgoing ║
║       4 ║ ItemB  ║ 1/2/20 4:57 PM   ║ Incoming_Loc_B ║ Incoming ║
║       5 ║ ItemB  ║ 1/2/20 5:01 PM   ║ Incoming_Loc_A ║ Incoming ║
║       6 ║ ItemB  ║ 12/12/19 5:58 PM ║ Outgoing_Loc   ║ Outgoing ║
║       7 ║ ItemB  ║ 12/12/19 5:57 PM ║ Outgoing_Loc   ║ Outgoing ║
║       8 ║ ItemB  ║ 5/20/19 10:19 AM ║ Outgoing_Loc   ║ Outgoing ║
║       9 ║ ItemC  ║ 1/9/20 9:20 AM   ║ Outgoing_Loc   ║ Outgoing ║
║      10 ║ ItemC  ║ 1/2/20 6:42 PM   ║ Incoming_Loc_A ║ Incoming ║
║      11 ║ ItemC  ║ 12/20/19 5:54 AM ║ Outgoing_Loc   ║ Outgoing ║
║      12 ║ ItemC  ║ 10/10/19 6:13 PM ║ Outgoing_Loc   ║ Outgoing ║
║      13 ║ ItemC  ║ 10/5/19 7:00 PM  ║ Incoming_Loc_A ║ Incoming ║
║      14 ║ ItemC  ║ 7/16/19 9:18 AM  ║ Outgoing_Loc   ║ Outgoing ║
╚═════════╩════════╩══════════════════╩════════════════╩══════════╝

I tried to cover every type of problem in the table, spread across the different items.

The perfect transaction is ItemA. It's simple and clean; if they were all like this, I could just join the tables and pull the dates into separate columns.

ItemB: you'll notice this one was scanned into both of the incoming locations, but I only need to return one, the earliest scan from that batch. Additionally, I need to return the incoming scan that falls after the previous outgoing (12/12/19) and before the last outgoing (1/3/20).

ItemC: similar to the last statement for ItemB, this item came in and out of the locations twice. I need to get the incoming and outgoing pairs that make the most sense chronologically.

I don't know how hard this is to figure out, but I'm having a tough time finding a solution for it. I'm not sure how to squeeze in the incoming date between the outgoing.

Example of Output:
Need to get how many days each item was in the station. If the item has been in-n-out multiple times, need to pair the incoming and outgoing that makes the most sense chronologically. ItemC for example, has multiple incoming and outgoing dates, but I only need the dates that have a beginning and end as a pair.

+--------+-----------------+------------------+-----------------+
| ItemID |    Incoming     |     Outgoing     | Days in Station |
+--------+-----------------+------------------+-----------------+
| ItemA  | 1/2/20 7:29 AM  | 1/7/20 12:49 PM  | 5.00            |
| ItemB  | 1/2/20 4:57 PM  | 1/3/20 11:01 AM  | 1.00            |
| ItemC  | 1/2/20 6:42 PM  | 1/9/20 9:20 AM   | 7.00            |
| ItemC  | 10/5/19 7:00 PM | 10/10/19 6:13 PM | 5.00            |
+--------+-----------------+------------------+-----------------+

Answer 1:


This is a gaps-and-islands problem. One approach is to define groups using a cumulative sum that increments for every incoming record, and to use that group number for aggregation:

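select
    itemID,
    min(dates) as incoming,   -- earliest scan in the group
    max(dates) as outgoing,   -- latest scan in the group
    -- elapsed seconds converted to fractional days
    datediff(second, min(dates), max(dates)) / 60.0 / 60 / 24 as days_in_station
from (
    select
        t.*,
        -- running count of incoming scans per item: each incoming scan starts a new group
        sum(case when type = 'Incoming' then 1 else 0 end)
            over(partition by itemID order by dates) as grp
    from mytable t
) t
group by itemID, grp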
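
To see how the running sum assigns groups, here is a minimal sketch that applies the same window expression to the ItemC rows from the question. It is only an illustration under assumptions: it runs on SQLite (3.25 or later, for window functions) through Python's sqlite3 module instead of the SQL Server dialect used above, and it stores the dates as ISO strings so that text order matches chronological order.

```python
import sqlite3

# ItemC scans from the question. Dates are rewritten as ISO strings
# (an assumption of this sketch) so that sorting text sorts by time.
rows = [
    ("ItemC", "2020-01-09 09:20", "Outgoing"),
    ("ItemC", "2020-01-02 18:42", "Incoming"),
    ("ItemC", "2019-12-20 05:54", "Outgoing"),
    ("ItemC", "2019-10-10 18:13", "Outgoing"),
    ("ItemC", "2019-10-05 19:00", "Incoming"),
    ("ItemC", "2019-07-16 09:18", "Outgoing"),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (itemID text, dates text, type text)")
conn.executemany("insert into mytable values (?, ?, ?)", rows)

# Same running sum as in the answer: grp goes up by one at every incoming scan.
groups = conn.execute("""
    select dates, type,
           sum(case when type = 'Incoming' then 1 else 0 end)
               over (partition by itemID order by dates) as grp
    from mytable
    order by dates
""").fetchall()

for dates, scan_type, grp in groups:
    print(grp, dates, scan_type)
```

Read in date order, the group numbers come out as 0, 1, 1, 1, 2, 2. The outgoing scan of 7/16/19 lands in group 0 because no incoming scan precedes it, and both outgoing scans after 10/5/19 share group 1.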

Your question does not specify what should happen when incoming/outgoing records do not properly interleave for a given item. Here is how the query would handle that:

  • if there are two consecutive incoming records, this generates a row in the result set where the incoming and outgoing dates are identical, and days in station is 0

  • if there are two or more consecutive outgoing records, only the last one is considered

  • if outgoing records precede the first incoming record of an item, they form a group of their own (grp = 0), so the row for that group contains outgoing dates only

These could be fine-tuned if more details on the requirement were provided.
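
One possible fine-tuning, sketched here under assumptions rather than offered as a definitive solution: start a new group only at an incoming scan that does not directly follow another incoming scan, take the earliest incoming and the earliest outgoing date in each group, and drop groups that lack either one. These rules are inferred from the expected output in the question. Note that they merge any run of consecutive incoming scans, not only scans made on the same day. The sketch runs on SQLite (3.25 or later) through Python's sqlite3 module, uses ISO date strings, and counts whole calendar days to match the expected output; `julianday` and `date` are SQLite functions, so the day arithmetic would need `datediff` in SQL Server.

```python
import sqlite3

# All scans from the question, with dates as ISO strings (an assumption of
# this sketch) so that text order is chronological order.
rows = [
    ("ItemA", "2020-01-07 12:49", "Outgoing"),
    ("ItemA", "2020-01-02 07:29", "Incoming"),
    ("ItemB", "2020-01-03 11:01", "Outgoing"),
    ("ItemB", "2020-01-02 16:57", "Incoming"),
    ("ItemB", "2020-01-02 17:01", "Incoming"),
    ("ItemB", "2019-12-12 17:58", "Outgoing"),
    ("ItemB", "2019-12-12 17:57", "Outgoing"),
    ("ItemB", "2019-05-20 10:19", "Outgoing"),
    ("ItemC", "2020-01-09 09:20", "Outgoing"),
    ("ItemC", "2020-01-02 18:42", "Incoming"),
    ("ItemC", "2019-12-20 05:54", "Outgoing"),
    ("ItemC", "2019-10-10 18:13", "Outgoing"),
    ("ItemC", "2019-10-05 19:00", "Incoming"),
    ("ItemC", "2019-07-16 09:18", "Outgoing"),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (itemID text, dates text, type text)")
conn.executemany("insert into mytable values (?, ?, ?)", rows)

query = """
with flagged as (
    select t.*,
           -- a visit starts at an incoming scan whose previous scan
           -- (if any) is not itself an incoming scan
           case when type = 'Incoming'
                 and coalesce(lag(type) over (partition by itemID order by dates),
                              'Outgoing') <> 'Incoming'
                then 1 else 0 end as starts_visit
    from mytable t
),
grouped as (
    select f.*,
           sum(starts_visit) over (partition by itemID order by dates) as grp
    from flagged f
)
select itemID,
       min(case when type = 'Incoming' then dates end) as incoming,
       min(case when type = 'Outgoing' then dates end) as outgoing,
       -- whole calendar days between the two scans
       julianday(date(min(case when type = 'Outgoing' then dates end)))
         - julianday(date(min(case when type = 'Incoming' then dates end)))
         as days_in_station
from grouped
where grp > 0                       -- skip outgoing scans before any incoming scan
group by itemID, grp
having min(case when type = 'Outgoing' then dates end) is not null  -- still in station
order by itemID, incoming desc
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

On the sample data this returns the four pairs listed in the expected output: one for ItemA, one for ItemB starting at the 4:57 PM scan, and two for ItemC.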



Source: https://stackoverflow.com/questions/62400218/return-duration-of-an-item-from-its-transactions-many-to-many-sql
