Aggregate periods using recursive queries

Submitted by 社会主义新天地 on 2019-12-13 03:29:56

Question


I need to merge overlapping periods (defined by FROM and TO variables) of sequential events (with identifier NUM) for each group (ID) with a "lookahead buffer", meaning that if the next period starts within the buffer zone, they should be merged.

For instance, in the example below the second event (NUM = 2) starts at time 13, which is within the buffer zone of the first event (TO + LOOKAHEAD = 10 + 5 = 15), so the two are merged.

The tricky part, compared to other similar problems I've found, is that although each event has a fixed buffer period, the effective buffer can change when the event is merged with an earlier event (merging only looks backwards) that has a longer buffer period.

For instance, event 3 is merged into the same period as events 1 and 2, and because those events have a longer buffer period, the buffer zone after event 3 should be 25 + 5 = 30 rather than 25 + 3 = 28. Event 4, which starts at 29, should therefore be included in the same period as well.

In the same way, the buffer period of event 4 becomes 5. However, because 40 > 31 + 5 = 36, the last event is a separate observation.

CREATE TABLE MY_TABLE(ID INTEGER, NUM INTEGER, "FROM" INTEGER, "TO" INTEGER, LOOKAHEAD INTEGER);  -- FROM and TO are reserved words, hence the quotes
INSERT INTO MY_TABLE VALUES (1, 1, 1,  10, 5);
INSERT INTO MY_TABLE VALUES (1, 2, 13, 20, 5);
INSERT INTO MY_TABLE VALUES (1, 3, 21, 25, 3);
INSERT INTO MY_TABLE VALUES (1, 4, 29, 31, 3);
INSERT INTO MY_TABLE VALUES (1, 5, 40, 50, 3);

In the end, the result I need is two observations covering the two disjoint periods:

(ID = 1, FROM = 1,  TO = 31)
(ID = 1, FROM = 40, TO = 50)
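
For reference, the merge rule described above can be written procedurally. The following is a minimal Python sketch, not Teradata SQL, intended only as a way to check expected results against the sample data. The function name merge_periods and the tuple layout are hypothetical, and it assumes that an event starting exactly at TO + LOOKAHEAD still counts as inside the buffer zone, which the question does not state explicitly.

```python
from itertools import groupby
from operator import itemgetter

# Rows are (id, num, from, to, lookahead), copied from the sample data above.
ROWS = [
    (1, 1, 1, 10, 5),
    (1, 2, 13, 20, 5),
    (1, 3, 21, 25, 3),
    (1, 4, 29, 31, 3),
    (1, 5, 40, 50, 3),
]


def merge_periods(rows):
    """Merge sequential events per id, carrying the largest lookahead forward."""
    merged = []
    ordered = sorted(rows, key=itemgetter(0, 1))  # order by id, then num
    for group_id, events in groupby(ordered, key=itemgetter(0)):
        start = end = buffer = None
        for _, _, ev_from, ev_to, ev_lookahead in events:
            # Assumption: a start exactly at end + buffer still merges.
            if start is not None and ev_from <= end + buffer:
                end = max(end, ev_to)
                # The merged period keeps the longest buffer seen so far.
                buffer = max(buffer, ev_lookahead)
            else:
                if start is not None:
                    merged.append((group_id, start, end))
                start, end, buffer = ev_from, ev_to, ev_lookahead
        if start is not None:
            merged.append((group_id, start, end))
    return merged


print(merge_periods(ROWS))
```

The state carried from row to row (the period start, the running end, and the running maximum buffer) is exactly what a self-referencing window expression cannot express directly.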

My first thought was to derive the effective lookahead with OLAP functions, by creating a new variable LOOKAHEAD2 that is the maximum of the previous record's LOOKAHEAD2 and the current record's LOOKAHEAD, conditional on FROM (this record) < TO + LOOKAHEAD (previous record). This doesn't work, however, because the definition of LOOKAHEAD2 refers to itself.

Instead, I tried using recursive queries, where I start with the first event (NUM = 1) and then recursively join the table with the next event (root.NUM + 1 = next.NUM), conditional on (root.TO + root.LOOKAHEAD > next.FROM), updating the LOOKAHEAD variable accordingly.

However, I have never used recursive queries before, and I can't get the join to use the updated LOOKAHEAD value.

Does anyone know how to solve this, with recursive queries or some other approach?


Answer 1:


You should use the RESET WHEN window modifier in your analytic functions (LAG in Teradata 16, or MAX in earlier ones); don't use a recursive query.

Update:

DROP TABLE MY_TABLE;
CREATE VOLATILE TABLE MY_TABLE 
( id          INTEGER
, num         INTEGER
, from_value  INTEGER
, to_value    INTEGER
, lookahead   INTEGER
) ON COMMIT PRESERVE ROWS;

INSERT INTO MY_TABLE VALUES (1, 1, 1,  10, 5);
INSERT INTO MY_TABLE VALUES (1, 2, 13, 20, 5);
INSERT INTO MY_TABLE VALUES (1, 3, 21, 25, 3);
INSERT INTO MY_TABLE VALUES (1, 4, 29, 31, 3);
INSERT INTO MY_TABLE VALUES (1, 5, 40, 50, 3);

INSERT INTO MY_TABLE VALUES (2, 1, 1, 10, 5);
INSERT INTO MY_TABLE VALUES (2, 2, 20, 30, 15);
INSERT INTO MY_TABLE VALUES (2, 3, 40, 41, 5);
INSERT INTO MY_TABLE VALUES (2, 4, 100, 200, 5);
INSERT INTO MY_TABLE VALUES (2, 5, 300, 400, 3);


SELECT  id, first_from_value, to_value
FROM  ( SELECT  id
              , to_value
              , CASE WHEN overlaps_flag = 1
                  THEN  NULL
                  ELSE  COALESCE 
                        ( MIN (from_value) 
                            OVER (PARTITION BY id
                                  ORDER BY from_value
                                  RESET WHEN MAX (overlaps_flag) 
                                               OVER (PARTITION BY id 
                                                     ROWS BETWEEN 
                                                          1 PRECEDING 
                                                      AND 1 PRECEDING) = 0
                                  ROWS BETWEEN UNBOUNDED PRECEDING 
                                           AND 1 PRECEDING)
                        , from_value )
                END AS first_from_value
        FROM  ( SELECT  id, from_value, to_value
                      , MAX (from_value) 
                          OVER (PARTITION BY id 
                                ORDER BY from_value 
                                ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING)
                          AS next_from_value
                      , CASE WHEN to_value + lookahead + 1 >= next_from_value
                          THEN 1 ELSE 0 
                        END AS overlaps_flag
                FROM  my_table
              ) AS a
      ) AS a
WHERE first_from_value IS NOT NULL
ORDER BY 1, 2;

Result:

id  first_from_value    to_value
1   1                   31
1   40                  50
2   1                   10
2   20                  41
2   100                 200
2   300                 400
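
To make the flag logic of the query easier to follow, here is a rough Python sketch of the same idea: flag each row whose to_value + lookahead + 1 reaches the next row's from_value, and close a period at every row whose flag is 0. The function name islands and the tuple layout are hypothetical. Like the query, it compares each row's own lookahead with the next row and does not carry a longer lookahead forward from earlier rows.

```python
from itertools import groupby
from operator import itemgetter

# Rows are (id, from_value, to_value, lookahead), copied from the inserts above.
ROWS = [
    (1, 1, 10, 5), (1, 13, 20, 5), (1, 21, 25, 3), (1, 29, 31, 3), (1, 40, 50, 3),
    (2, 1, 10, 5), (2, 20, 30, 15), (2, 40, 41, 5), (2, 100, 200, 5), (2, 300, 400, 3),
]


def islands(rows):
    """Group rows into periods using a per-row overlap flag, as the query does."""
    result = []
    ordered = sorted(rows, key=itemgetter(0, 1))  # order by id, then from_value
    for group_id, group in groupby(ordered, key=itemgetter(0)):
        group = list(group)
        start = None  # from_value of the first row in the current period
        for i, (_, from_value, to_value, lookahead) in enumerate(group):
            if start is None:
                start = from_value
            # Mirrors next_from_value: the following row's from_value, if any.
            next_from = group[i + 1][1] if i + 1 < len(group) else None
            # Mirrors overlaps_flag; a missing next row gives a flag of 0.
            overlaps = next_from is not None and to_value + lookahead + 1 >= next_from
            if not overlaps:
                # Only the row that closes a period is emitted, with its to_value.
                result.append((group_id, start, to_value))
                start = None
    return result


for row in islands(ROWS):
    print(row)
```

On the sample data this yields the same six rows as the result table above.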


Source: https://stackoverflow.com/questions/51879557/aggregate-periods-using-recursive-queries