Question
Hopefully I can get some help on this.
Situation
There are two incoming stations and one outgoing station. Items are scanned in and out. I need to know how long an item was in the station. Let's consider 'in station' to be the time between its incoming scan date and its outgoing scan date.
Problem
An item can be (accidentally) scanned multiple times into either incoming station. For this, I was thinking of checking whether the scans were made on the same day (ignoring the time of day) and then returning the earliest scan time.
An item can come in and out of the station multiple times (multiple in and out scans).
If an item was scanned into both incoming locations, I need to get the earliest time.
Sample data:
╔═════════╦════════╦══════════════════╦════════════════╦══════════╗
║ Row_num ║ ItemID ║ Dates            ║ LocationName   ║ Type     ║
╠═════════╬════════╬══════════════════╬════════════════╬══════════╣
║ 1       ║ ItemA  ║ 1/7/20 12:49 PM  ║ Outgoing_Loc   ║ Outgoing ║
║ 2       ║ ItemA  ║ 1/2/20 7:29 AM   ║ Incoming_Loc_A ║ Incoming ║
║ 3       ║ ItemB  ║ 1/3/20 11:01 AM  ║ Outgoing_Loc   ║ Outgoing ║
║ 4       ║ ItemB  ║ 1/2/20 4:57 PM   ║ Incoming_Loc_B ║ Incoming ║
║ 5       ║ ItemB  ║ 1/2/20 5:01 PM   ║ Incoming_Loc_A ║ Incoming ║
║ 6       ║ ItemB  ║ 12/12/19 5:58 PM ║ Outgoing_Loc   ║ Outgoing ║
║ 7       ║ ItemB  ║ 12/12/19 5:57 PM ║ Outgoing_Loc   ║ Outgoing ║
║ 8       ║ ItemB  ║ 5/20/19 10:19 AM ║ Outgoing_Loc   ║ Outgoing ║
║ 9       ║ ItemC  ║ 1/9/20 9:20 AM   ║ Outgoing_Loc   ║ Outgoing ║
║ 10      ║ ItemC  ║ 1/2/20 6:42 PM   ║ Incoming_Loc_A ║ Incoming ║
║ 11      ║ ItemC  ║ 12/20/19 5:54 AM ║ Outgoing_Loc   ║ Outgoing ║
║ 12      ║ ItemC  ║ 10/10/19 6:13 PM ║ Outgoing_Loc   ║ Outgoing ║
║ 13      ║ ItemC  ║ 10/5/19 7:00 PM  ║ Incoming_Loc_A ║ Incoming ║
║ 14      ║ ItemC  ║ 7/16/19 9:18 AM  ║ Outgoing_Loc   ║ Outgoing ║
╚═════════╩════════╩══════════════════╩════════════════╩══════════╝
I tried to include every type of problem in the table, spread across the different items.
The perfect transaction is ItemA. It's simple and clean; if they were all like this, I could just join the tables and pull the dates into separate columns.
ItemB: you'll notice this one was scanned into both incoming locations, but I only need to return one, the earliest scan from that batch. Additionally, I need to return the incoming scan that falls after the earlier outgoing scan (12/12/19) and before the last outgoing scan (1/3/20).
ItemC: similar to the last point for ItemB, this item came in and out of the locations twice. I need to get the incoming and outgoing pairs that make the most sense chronologically.
I don't know how hard this is to figure out, but I'm having a tough time finding a solution. I'm not sure how to slot each incoming date between the right outgoing dates.
Example output:
I need to get how many days each item was in the station. If the item has been in and out multiple times, I need to pair the incoming and outgoing scans that make the most sense chronologically. ItemC, for example, has multiple incoming and outgoing dates, but I only need the dates that form a beginning-and-end pair.
+--------+-----------------+------------------+-----------------+
| ItemID | Incoming        | Outgoing         | Days in Station |
+--------+-----------------+------------------+-----------------+
| ItemA  | 1/2/20 7:29 AM  | 1/7/20 12:49 PM  | 5.00            |
| ItemB  | 1/2/20 4:57 PM  | 1/3/20 11:01 AM  | 1.00            |
| ItemC  | 1/2/20 6:42 PM  | 1/9/20 9:20 AM   | 7.00            |
| ItemC  | 10/5/19 7:00 PM | 10/10/19 6:13 PM | 5.00            |
+--------+-----------------+------------------+-----------------+
Answer 1:
This is a gaps-and-islands problem. One approach is to define groups using a cumulative sum that increments at every incoming record, and then aggregate by that group:
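select
    itemID,
    min(dates) as incoming,
    max(dates) as outgoing,
    -- difference in seconds, converted to fractional days
    datediff(second, min(dates), max(dates)) / 60.0 / 60 / 24 as days_in_station
from (
    select
        t.*,
        -- running count of incoming scans per item: each incoming scan starts a new group
        sum(case when type = 'Incoming' then 1 else 0 end)
            over(partition by itemID order by dates) as grp
    from mytable t
) t
group by itemID, grp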
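To make the grouping concrete, here is a small Python sketch (not part of the original answer) that mimics the running count and the final group by for the ItemC rows of the sample data. The names `rows` and `islands` are made up for this illustration, and the dates are typed in as `datetime` literals rather than read from a table.

```python
from datetime import datetime
from itertools import groupby

# ItemC rows from the sample table: (scan time, type)
rows = [
    (datetime(2020, 1, 9, 9, 20), "Outgoing"),
    (datetime(2020, 1, 2, 18, 42), "Incoming"),
    (datetime(2019, 12, 20, 5, 54), "Outgoing"),
    (datetime(2019, 10, 10, 18, 13), "Outgoing"),
    (datetime(2019, 10, 5, 19, 0), "Incoming"),
    (datetime(2019, 7, 16, 9, 18), "Outgoing"),
]


def islands(rows):
    """Return (grp, earliest scan, latest scan) per group for one item."""
    labelled = []
    grp = 0
    # Equivalent of the windowed sum: walk the scans in date order and
    # bump the counter at every incoming scan.
    for when, kind in sorted(rows):
        if kind == "Incoming":
            grp += 1
        labelled.append((grp, when))

    # Equivalent of "group by grp" with min(dates) and max(dates).
    result = []
    for g, members in groupby(labelled, key=lambda pair: pair[0]):
        dates = [when for _, when in members]
        result.append((g, min(dates), max(dates)))
    return result


for g, first, last in islands(rows):
    print(g, first, last)
```

For ItemC this yields three groups: group 0 holds only the 7/16/19 outgoing scan (nothing came in before it, so its earliest and latest dates coincide), group 1 runs from 10/5/19 to 12/20/19, and group 2 runs from 1/2/20 to 1/9/20.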
Your question does not specify what should happen when incoming and outgoing records do not properly interleave for a given item. Here is how the query handles those cases:
if there are two consecutive incoming records, the first one ends up alone in its group, which generates a row in the result set where the incoming and outgoing dates are identical and days_in_station is 0
if there are two or more consecutive outgoing records, only the last one is considered
These could be fine-tuned if more details on the requirements were provided.
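As one possible fine-tuning, here is a Python sketch of a stricter pairing rule. It rests on assumptions the question does not confirm: a run of consecutive incoming scans counts as one arrival (the earliest scan wins, regardless of location or day), a visit ends at the first outgoing scan after that arrival, outgoing scans with no open arrival are ignored, and "Days in Station" is the number of calendar-day boundaries crossed, which is what the whole numbers in the expected output suggest. The names `pair_visits` and `sample` are hypothetical.

```python
from datetime import datetime


def pair_visits(scans):
    """Pair the earliest incoming scan of each run with the next outgoing scan.

    scans: iterable of (item_id, scan time, type) tuples.
    Returns a list of (item_id, incoming, outgoing, calendar_days).
    """
    visits = []
    open_in = {}  # item_id -> earliest incoming scan not yet matched
    for item, when, kind in sorted(scans, key=lambda s: (s[0], s[1])):
        if kind == "Incoming":
            # Later incoming scans in the same run do not replace the first.
            open_in.setdefault(item, when)
        elif item in open_in:
            start = open_in.pop(item)
            days = (when.date() - start.date()).days
            visits.append((item, start, when, days))
        # An outgoing scan with no open arrival is skipped.
    return visits


# The 14 rows of the sample table.
sample = [
    ("ItemA", datetime(2020, 1, 7, 12, 49), "Outgoing"),
    ("ItemA", datetime(2020, 1, 2, 7, 29), "Incoming"),
    ("ItemB", datetime(2020, 1, 3, 11, 1), "Outgoing"),
    ("ItemB", datetime(2020, 1, 2, 16, 57), "Incoming"),
    ("ItemB", datetime(2020, 1, 2, 17, 1), "Incoming"),
    ("ItemB", datetime(2019, 12, 12, 17, 58), "Outgoing"),
    ("ItemB", datetime(2019, 12, 12, 17, 57), "Outgoing"),
    ("ItemB", datetime(2019, 5, 20, 10, 19), "Outgoing"),
    ("ItemC", datetime(2020, 1, 9, 9, 20), "Outgoing"),
    ("ItemC", datetime(2020, 1, 2, 18, 42), "Incoming"),
    ("ItemC", datetime(2019, 12, 20, 5, 54), "Outgoing"),
    ("ItemC", datetime(2019, 10, 10, 18, 13), "Outgoing"),
    ("ItemC", datetime(2019, 10, 5, 19, 0), "Incoming"),
    ("ItemC", datetime(2019, 7, 16, 9, 18), "Outgoing"),
]

for visit in pair_visits(sample):
    print(visit)
```

Under these assumptions the sketch produces the four pairs from the expected output table. An incoming scan that never gets a later outgoing scan is left unreported, which may or may not be what is wanted.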
Source: https://stackoverflow.com/questions/62400218/return-duration-of-an-item-from-its-transactions-many-to-many-sql