Compute lag difference for different days

Submitted by 本秂侑毒 on 2019-12-11 14:19:08

Question


I need help computing a date difference across rows with a variable lag (specifically, between rows that are not on the same day) without subqueries, joins, etc. I think this should be possible with inline T-SQL window functions that use an OVER(PARTITION BY) clause, such as LAG or DENSE_RANK, but I can't quite put my finger on it. This is for SQL Server 2017 Developer Edition.

A clarifying example:

Consider a dataset with job beginning and end dates (across various projects). Some jobs start on the same day as each other (such as jobs 2 & 3, 4 & 5). I need to compute the idle time between consecutive jobs that started on different days (per project), that is, the number of days between the previous job's end and the current job's beginning. If the previous job started on the same day, then look further back in the history of the same project. In other words, jobs that started on the same day can be considered parts of the same job.

UPDATE: I simplified the code and output by dropping the time values (the question's edit history has the original dataset).

IF OBJECT_ID('tempdb..#t') IS NOT NULL DROP TABLE #t; 
CREATE TABLE #t(Prj TINYINT, Beg DATE, Eñd DATE);
INSERT INTO #t SELECT 1, '1/1/17', '1/2/17';
INSERT INTO #t SELECT 1, '1/5/17', '1/7/17';
INSERT INTO #t SELECT 1, '1/5/17', '1/7/17';
INSERT INTO #t SELECT 1, '1/15/17', '1/15/17';
INSERT INTO #t SELECT 1, '1/15/17', '1/18/17';
INSERT INTO #t SELECT 1, '1/20/17', '1/24/17';
INSERT INTO #t SELECT 2, '2/2/17', '2/5/17';
INSERT INTO #t SELECT 2, '2/7/17', '2/9/17';
ALTER TABLE #t ADD Job INT NOT NULL IDENTITY (1,1) PRIMARY KEY;

LAG(.,1) uses exactly the previous row's end date, which is not what I want. It yields an incorrect idle duration within the same-day pairs (jobs 2 & 3, 4 & 5): the second row of each pair picks up the end date of its same-day sibling. Jobs 2 & 3 should both use the end date of job 1. Jobs 4 & 5 should both use the end date of job 3. The joined query computes the idle duration correctly, but an inline calculation (without joins or subqueries) is what I am after.

SELECT c.Job, c.Prj, c.Beg, c.Eñd, 
-- in-line computation with OVER clause
PrvEñd_lg=LAG(c.Eñd,1) OVER(PARTITION BY c.Prj ORDER BY c.Beg),
Idle_lg=DATEDIFF(DAY, LAG(c.Eñd,1) OVER(PARTITION BY c.Prj ORDER BY c.Beg), c.Beg),
-- calculation over current and (joined) previous records
PrvEñd_j=MAX(p.Eñd), 
IdleDur_j=DATEDIFF(DAY, MAX(p.Eñd), c.Beg)
FROM #t c LEFT JOIN #t p ON c.Prj=p.Prj AND c.Beg > p.Eñd
GROUP BY c.Job, c.Prj, c.Beg, c.Eñd
ORDER BY c.Prj, c.Beg


Job Prj Beg         Eñd         PrvEñd_lg   Idle_lg PrvEñd_j    IdleDur_j
1   1   2017-01-01  2017-01-02  NULL        NULL    NULL        NULL
2   1   2017-01-05  2017-01-07  2017-01-02  3       2017-01-02  3
3   1   2017-01-05  2017-01-07  2017-01-07  -2      2017-01-02  3
4   1   2017-01-15  2017-01-15  2017-01-07  8       2017-01-07  8
5   1   2017-01-15  2017-01-18  2017-01-15  0       2017-01-07  8
6   1   2017-01-20  2017-01-24  2017-01-18  2       2017-01-18  2
7   2   2017-02-02  2017-02-05  NULL        NULL    NULL        NULL
8   2   2017-02-07  2017-02-09  2017-02-05  2       2017-02-05  2

Please let me know if I can clarify any specific details further.

Many thanks!


Answer 1:


You can use a self-join.

select a.Job
, a.Prj
, a.Beg
, a.Eñd
, max(b.Eñd) as PrevEñd
, min(datediff(mi, b.Eñd, a.Beg) / (60*24.0)) as IdleDur
from #t as a
left join #t as b on a.Prj = b.Prj
                 and cast(a.Beg as date) > cast(b.Eñd as date)
group by a.Job
, a.Prj
, a.Beg
, a.Eñd

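This produces the following output (computed on the question's original dataset, which still included time values; IdleDur is in fractional days):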

+-----+-----+---------------------+---------------------+---------------------+-----------+
| Job | Prj |         Beg         |         Eñd         |       PrevEñd       |  IdleDur  |
+-----+-----+---------------------+---------------------+---------------------+-----------+
|   1 |   1 | 2017-01-01 01:00:00 | 2017-01-02 02:00:00 | NULL                | NULL      |
|   2 |   1 | 2017-01-05 02:00:00 | 2017-01-07 03:00:00 | 2017-01-02 02:00:00 | 3.0000000 |
|   3 |   1 | 2017-01-05 03:00:00 | 2017-01-07 02:00:00 | 2017-01-02 02:00:00 | 3.0416666 |
|   4 |   1 | 2017-01-15 04:00:00 | 2017-01-15 03:00:00 | 2017-01-07 03:00:00 | 8.0416666 |
|   5 |   1 | 2017-01-15 15:00:00 | 2017-01-18 03:00:00 | 2017-01-07 03:00:00 | 8.5000000 |
|   6 |   1 | 2017-01-20 05:00:00 | 2017-01-24 02:00:00 | 2017-01-18 03:00:00 | 2.0833333 |
|   7 |   2 | 2017-02-02 06:00:00 | 2017-02-05 03:00:00 | NULL                | NULL      |
|   8 |   2 | 2017-02-07 07:00:00 | 2017-02-09 02:00:00 | 2017-02-05 03:00:00 | 2.1666666 |
+-----+-----+---------------------+---------------------+---------------------+-----------+
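
For comparison, here is a minimal sketch of a window-function variant that avoids the join. It is not part of the original answer, and it still needs a CTE, so it does not strictly satisfy the "no subqueries" wish. It assumes the simplified #t table from the question (DATE columns) and treats the end of a same-day group as the latest Eñd within that group; the names g, GrpEnd and rn are made up for illustration. The idea relies on LAG accepting a column as its offset: each row steps back by its own position within its same-day group, which should land on the last row of the previous group.

```sql
-- Sketch only: assumes the #t table from the question already exists.
-- g, GrpEnd and rn are illustrative names, not from the original post.
WITH g AS (
    SELECT Job, Prj, Beg, Eñd,
           -- latest end date among jobs of the same project that began on the same day
           GrpEnd = MAX(Eñd) OVER (PARTITION BY Prj, Beg),
           -- position of the job inside its same-day group (1, 2, ...)
           rn = ROW_NUMBER() OVER (PARTITION BY Prj, Beg ORDER BY Job)
    FROM #t
)
SELECT Job, Prj, Beg, Eñd,
       -- stepping back rn rows lands on the last row of the previous same-day group
       PrvEñd = LAG(GrpEnd, rn) OVER (PARTITION BY Prj ORDER BY Beg, rn),
       Idle = DATEDIFF(DAY, LAG(GrpEnd, rn) OVER (PARTITION BY Prj ORDER BY Beg, rn), Beg)
FROM g
ORDER BY Prj, Beg, Job;
```

Traced by hand on the sample rows, this gives idle values of 3, 3, 8, 8 and 2 days for jobs 2 through 6, the same as the join. With datetime columns, as in the original dataset, the PARTITION BY and ORDER BY expressions would presumably need CAST(Beg AS date) so that jobs are grouped by start day rather than by exact start time.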


Source: https://stackoverflow.com/questions/47720899/compute-lag-difference-for-different-days
