Get total time interval from multiple rows if sequence not broken

后端 未结 2 971
傲寒
傲寒 2021-01-21 19:31

I have Work and Person tables (these are just examples to understand problem).

Structure

Work table

id         


        
2条回答
  •  孤城傲影
    2021-01-21 20:09

    You could try postgres's WITH RECURSIVE construct. (after all, a linked list is a kind of tree) Getting the boundary conditions right will be a problem, but at least it would solve the problem without the need for loops.

    UPDATE: Added code. The problem with RECURSIVE is that you can only specify only the "tail" boundary condition. To specify the "head" condition, you need to wrap it into a view.

    CREATE VIEW collected_time AS (
        WITH RECURSIVE ztree(person_id, dt_from, dt_to)  AS ( 
        -- Terminal part
        SELECT  pr.person_id, pr.dt_from, pr.dt_to
        FROM prikklok pr
        WHERE NOT EXISTS (
                SELECT * FROM prikklok px
                WHERE px.person_id = pr.person_id AND px.dt_from = pr.dt_to
                )
        UNION
        -- Recursive part
        SELECT  p1.person_id AS person_id
        , p1.dt_from AS dt_from
        , p2.dt_to AS dt_to
        FROM prikklok AS p1
        , ztree AS p2
        WHERE p1.person_id = p2.person_id
        AND p1.dt_to = p2.dt_from
        )
    SELECT *
    FROM ztree zt
    WHERE NOT EXISTS (select *
        FROM prikklok p3
        WHERE p3.person_id = zt.person_id
            AND p3.dt_to = zt.dt_from
                )
        );
    
    SELECT * FROM collected_time;
    

    -- now generate some data with gaps

    INSERT INTO prikklok
        SELECT serie_n
        , serie_t
        , serie_t + '1 month'::interval
    FROM generate_series (1,10) serie_n
        , generate_series ( '1970-01-01 00:00:00' , '2011-09-01 00:00:00' , '1 month' ::interval) serie_t
        ;
    
    DELETE FROM prikklok
    WHERE random() <  0.001
        ;
    
    -- a few indexes won't hurt
    ALTER TABLE prikklok ADD PRIMARY KEY (person_id,dt_from)
        ;
    CREATE UNIQUE INDEX ON prikklok (person_id,dt_to);
    

    The resulting query plan looks perfect:

                                                                                  QUERY PLAN                                                                           
    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------
     Nested Loop Anti Join  (cost=1389.73..1469.09 rows=1 width=20) (actual time=13.580..40.920 rows=16 loops=1)
       CTE ztree
         ->  Recursive Union  (cost=0.00..1389.73 rows=11 width=20) (actual time=0.136..27.405 rows=5004 loops=1)
               ->  Merge Anti Join  (cost=0.00..638.92 rows=1 width=20) (actual time=0.130..10.011 rows=16 loops=1)
                     Merge Cond: ((pr.person_id = px.person_id) AND (pr.dt_to = px.dt_from))
                     ->  Index Scan using prikklok_person_id_dt_to_idx on prikklok pr  (cost=0.00..291.31 rows=5004 width=20) (actual time=0.063..2.273 rows=5004 loops=1)
                     ->  Index Scan using prikklok_pkey on prikklok px  (cost=0.00..291.31 rows=5004 width=12) (actual time=0.012..2.204 rows=5004 loops=1)
               ->  Nested Loop  (cost=0.00..75.06 rows=1 width=20) (actual time=0.002..0.027 rows=10 loops=501)
                     ->  WorkTable Scan on ztree p2  (cost=0.00..0.20 rows=10 width=20) (actual time=0.000..0.001 rows=10 loops=501)
                     ->  Index Scan using prikklok_person_id_dt_to_idx on prikklok p1  (cost=0.00..7.47 rows=1 width=20) (actual time=0.002..0.002 rows=1 loops=5004)
                           Index Cond: ((p1.person_id = p2.person_id) AND (p1.dt_to = p2.dt_from))
       ->  CTE Scan on ztree zt  (cost=0.00..0.22 rows=11 width=20) (actual time=0.138..29.887 rows=5004 loops=1)
       ->  Index Scan using prikklok_person_id_dt_to_idx on prikklok p3  (cost=0.00..7.18 rows=1 width=12) (actual time=0.002..0.002 rows=1 loops=5004)
             Index Cond: ((p3.person_id = zt.person_id) AND (p3.dt_to = zt.dt_from))
     Total runtime: 41.354 ms
    (15 rows)
    

提交回复
热议问题