I have two tables, both with start time and end time fields. I need to find, for each row in the first table, all of the rows in the second table where the time intervals in
select * from table_1
right join
table_2 on
(
table_1.start between table_2.start and table_2.[end]
or
table_1.[end] between table_2.start and table_2.[end]
or
(table_1.[end] > table_2.start and table_2.[end] is null)
)
EDIT: Ok, don't go for my solution, it perfoms like shit. The "where" solution is 14x faster. Oops...
Some statistics: running on a db with ~ 65000 records for both table 1 and 2 (no indexing), having intervals of 2 days between start and end for each row, running for 2 minutes in SQLSMSE (don't have the patience to wait)
Using join: 8356 rows in 2 minutes
Using where: 115436 rows in 2 minutes
It's sound very complicated until you start working from reverse. Below I illustrated ONLY GOOD CASES (no overlaps)! defined by those 2 simple conditions, we have no overlap ranges if condA OR condB is TRUE, so we going to reverse those: NOT condA AND NOT CondB, in our case I just reversed signs (> became <=)
/*
|--------| A \___ CondA: b.ddStart > a.ddEnd
|=========| B / \____ CondB: a.ddS > b.ddE
|+++++++++| A /
*/
--DROP TABLE ran
create table ran ( mem_nbr int, ID int, ddS date, ddE date)
insert ran values
(100, 1, '2012-1-1','2012-12-30'), ----\ ovl
(100, 11, '2012-12-12','2012-12-24'), ----/
(100, 2, '2012-12-31','2014-1-1'),
(100, 3, '2014-5-1','2014-12-14') ,
(220, 1, '2015-5-5','2015-12-14') , ---\ovl
(220, 22, '2014-4-1','2015-5-25') , ---/
(220, 3, '2016-6-1','2016-12-16')
select DISTINCT a.mem_nbr , a.* , '-' [ ], b.dds, b.dde, b.id
FROM ran a
join ran b on a.mem_nbr = b.mem_nbr -- match by mem#
AND a.ID <> b.ID -- itself
AND b.ddS <= a.ddE -- NOT b.ddS > a.ddE
AND a.ddS <= b.ddE -- NOT a.ddS > b.ddE
And what, if you want to analyse such an overlap on a minute precision with 70m+ rows? the only solution i could make up myself was a time dimension table for the join
else the dublicate-handling became a headache .. and the processing cost where astronomical
"solutions from other platforms may be of interest also."
SQL Standard defines OVERLAPS predicate:
Specify a test for an overlap between two events.
<overlaps predicate> ::= <row value constructor 1> OVERLAPS <row value constructor 2>
Example:
SELECT 1
WHERE ('2020-03-01'::DATE, '2020-04-15'::DATE) OVERLAPS
('2020-02-01'::DATE, '2020-03-15'::DATE)
-- 1
db<>fiddle demo
SELECT *
FROM table1,table2
WHERE table2.start <= table1.end
AND (table2.end IS NULL OR table2.end >= table1.start)