What is a simple and efficient way to find rows with time-interval overlaps in SQL?

后端 未结 5 1531
清歌不尽
清歌不尽 2020-11-30 00:22

I have two tables, both with start time and end time fields. I need to find, for each row in the first table, all of the rows in the second table where the time intervals in

相关标签:
5条回答
  • 2020-11-30 00:29
    select * from table_1 
    right join 
    table_2 on 
    (
    table_1.start between table_2.start and table_2.[end]
    or
    table_1.[end] between table_2.start and table_2.[end]
    or
    (table_1.[end] > table_2.start and table_2.[end] is null)
    )
    

    EDIT: Ok, don't go for my solution, it perfoms like shit. The "where" solution is 14x faster. Oops...

    Some statistics: running on a db with ~ 65000 records for both table 1 and 2 (no indexing), having intervals of 2 days between start and end for each row, running for 2 minutes in SQLSMSE (don't have the patience to wait)

    Using join: 8356 rows in 2 minutes

    Using where: 115436 rows in 2 minutes

    0 讨论(0)
  • 2020-11-30 00:35

    It's sound very complicated until you start working from reverse. Below I illustrated ONLY GOOD CASES (no overlaps)! defined by those 2 simple conditions, we have no overlap ranges if condA OR condB is TRUE, so we going to reverse those: NOT condA AND NOT CondB, in our case I just reversed signs (> became <=)

    /*
    |--------| A                             \___  CondA: b.ddStart >  a.ddEnd
                |=========| B                /      \____ CondB:  a.ddS >  b.ddE
                              |+++++++++| A         /
    */
    --DROP TABLE ran
    create table ran ( mem_nbr int, ID int, ddS date, ddE date)
    insert ran values  
    (100, 1,  '2012-1-1','2012-12-30'),    ----\ ovl
    (100, 11, '2012-12-12','2012-12-24'),  ----/
    (100, 2, '2012-12-31','2014-1-1'),
    (100, 3, '2014-5-1','2014-12-14') ,
    
    (220, 1, '2015-5-5','2015-12-14') ,    ---\ovl
    (220, 22, '2014-4-1','2015-5-25') ,    ---/
    (220, 3, '2016-6-1','2016-12-16')  
    
    select  DISTINCT a.mem_nbr ,  a.* , '-' [ ], b.dds, b.dde, b.id 
    FROM ran a
    join ran b  on  a.mem_nbr = b.mem_nbr          -- match by mem#
                   AND     a.ID <> b.ID            -- itself
                      AND     b.ddS <= a.ddE        -- NOT    b.ddS >  a.ddE       
                      AND     a.ddS <= b.ddE        -- NOT    a.ddS >  b.ddE   
    
    0 讨论(0)
  • 2020-11-30 00:37

    And what, if you want to analyse such an overlap on a minute precision with 70m+ rows? the only solution i could make up myself was a time dimension table for the join

    else the dublicate-handling became a headache .. and the processing cost where astronomical

    0 讨论(0)
  • 2020-11-30 00:48

    "solutions from other platforms may be of interest also."

    SQL Standard defines OVERLAPS predicate:

    Specify a test for an overlap between two events.

    <overlaps predicate> ::=
       <row value constructor 1>  OVERLAPS <row value constructor 2> 
    

    Example:

    SELECT 1
    WHERE ('2020-03-01'::DATE, '2020-04-15'::DATE) OVERLAPS 
          ('2020-02-01'::DATE, '2020-03-15'::DATE) 
    -- 1
    

    db<>fiddle demo

    0 讨论(0)
  • 2020-11-30 00:49
    SELECT * 
    FROM table1,table2 
    WHERE table2.start <= table1.end 
    AND (table2.end IS NULL OR table2.end >= table1.start)
    
    0 讨论(0)
提交回复
热议问题