Aggregate Overlapping Segments to Measure Effective Length

我寻月下人不归 2021-02-07 02:08

I have a road_events table:

create table road_events (
    event_id number(4,0),
    road_id number(4,0),
    year number(4,0),
    from_meas number         


        
6 Answers
  •  死守一世寂寞
    2021-02-07 02:23

    My main DBMS is Teradata, but this will work as-is in Oracle, too.

    WITH all_meas AS
     ( -- get a distinct list of all from/to points
       SELECT road_id, from_meas AS meas
       FROM road_events
       UNION
       SELECT road_id, to_meas
       FROM road_events
     )
    -- select * from all_meas order by 1,2
     , all_ranges AS
     ( -- create from/to ranges
       SELECT road_id, meas AS from_meas 
         ,Lead(meas)
          Over (PARTITION BY road_id
                ORDER BY meas) AS to_meas
       FROM all_meas
      )
     -- SELECT * from all_ranges order by 1,2
    , all_event_ranges AS
     ( -- now match the ranges to the event ranges
       SELECT 
          ar.*
         ,re.event_id
         ,re.year
         ,re.total_road_length
         ,ar.to_meas - ar.from_meas AS event_length
         -- used to filter the latest event as multiple events might cover the same range 
         ,Row_Number()
          Over (PARTITION BY ar.road_id, ar.from_meas
                ORDER BY year DESC) AS rn
       FROM all_ranges ar
       JOIN road_events re
         ON ar.road_id = re.road_id
        AND ar.from_meas < re.to_meas
        AND ar.to_meas > re.from_meas
       WHERE ar.to_meas IS NOT NULL
     )
    SELECT event_id, road_id, year, total_road_length, Sum(event_length) AS effective_length
    FROM all_event_ranges
    WHERE rn = 1 -- latest year only
    GROUP BY event_id, road_id, year, total_road_length
    ORDER BY road_id, year DESC;
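
To make the three steps easier to follow, here is a minimal Python sketch of the same logic, not a replacement for the query. The sample rows and the function name `effective_lengths` are hypothetical and are not taken from the question's data. It splits each road at every from/to point, lets the latest year win each elementary range, and sums the lengths per event.

```python
from collections import defaultdict

# Hypothetical sample rows (NOT the asker's data):
# (event_id, road_id, year, from_meas, to_meas)
events = [
    (1, 1, 2020, 25, 50),
    (2, 1, 2000, 0, 40),
    (3, 1, 1990, 30, 100),
]

def effective_lengths(events):
    # Step 1 (all_meas): distinct from/to points per road.
    points = defaultdict(set)
    for _, road_id, _, from_meas, to_meas in events:
        points[road_id].update((from_meas, to_meas))

    totals = defaultdict(int)
    for road_id, meas in points.items():
        ordered = sorted(meas)
        # Step 2 (all_ranges): consecutive points form elementary ranges.
        for lo, hi in zip(ordered, ordered[1:]):
            # Step 3 (all_event_ranges): events overlapping this range,
            # using the same overlap test as the join condition.
            covering = [e for e in events
                        if e[1] == road_id and lo < e[4] and hi > e[3]]
            if not covering:
                continue  # no event covers this range
            # rn = 1: the event with the latest year wins the range.
            winner = max(covering, key=lambda e: e[2])
            totals[(winner[0], road_id)] += hi - lo
    return dict(totals)

# Keys are (event_id, road_id), values are effective lengths.
print(effective_lengths(events))
# → {(2, 1): 25, (1, 1): 25, (3, 1): 50}
```

In this made-up data, event 1 keeps its full length of 25, while the older events 2 and 3 only count where no newer event covers them. If two events on the same range share a year, the winner is arbitrary, in the sketch as well as in the `Row_Number()` ordering.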
    

    If you need to return the actual covered from/to_meas (as in your question before the edit), it gets more complicated. The first part is the same, but without aggregation the query can return adjacent rows with the same event_id (e.g. for event 3: 0-1 & 1-25):

    SELECT * FROM all_event_ranges
    WHERE rn = 1
    ORDER BY road_id, from_meas;
    

    If you want to merge adjacent rows, you need two more steps. The standard approach is to flag the first row of each group and then calculate a group number:

    WITH all_meas AS
     (
       SELECT road_id, from_meas AS meas
       FROM road_events
       UNION
       SELECT road_id, to_meas
       FROM road_events
     )
    -- select * from all_meas order by 1,2
     , all_ranges AS
     ( 
       SELECT road_id, meas AS from_meas 
         ,Lead(meas)
          Over (PARTITION BY road_id
                ORDER BY meas) AS to_meas
       FROM all_meas
      )
    -- SELECT * from all_ranges order by 1,2
    , all_event_ranges AS
     (
       SELECT 
          ar.*
         ,re.event_id
         ,re.year
         ,re.total_road_length
         ,ar.to_meas - ar.from_meas AS event_length
         ,Row_Number()
          Over (PARTITION BY ar.road_id, ar.from_meas
                ORDER BY year DESC) AS rn
       FROM all_ranges ar
       JOIN road_events  re
         ON ar.road_id = re.road_id
        AND ar.from_meas < re.to_meas
        AND ar.to_meas > re.from_meas
       WHERE ar.to_meas IS NOT NULL
     )
    -- SELECT * FROM all_event_ranges WHERE rn = 1 ORDER BY road_id, from_meas
    , adjacent_events AS 
     ( -- assign 1 to the 1st row of an event
       SELECT t.*
         ,CASE WHEN Lag(event_id)
                    Over(PARTITION BY road_id
                         ORDER BY from_meas) = event_id
               THEN 0 
               ELSE 1 
          END AS flag
       FROM all_event_ranges t
       WHERE rn = 1
     )
    -- SELECT * FROM adjacent_events ORDER BY road_id, from_meas 
    , grouped_events AS
     ( -- assign a groupnumber to adjacent rows using a Cumulative Sum over 0/1
       SELECT t.*
         ,Sum(flag)
          Over (PARTITION BY road_id
                ORDER BY from_meas
                ROWS Unbounded Preceding) AS grp
       FROM adjacent_events t
    )
    -- SELECT * FROM grouped_events ORDER BY  road_id, from_meas
    SELECT event_id, road_id, year, Min(from_meas) AS from_meas, Max(to_meas) AS to_meas, total_road_length, Sum(event_length) AS effective_length
    FROM grouped_events
    GROUP BY event_id, road_id, grp, year, total_road_length
    ORDER BY 2, Min(from_meas);
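
The flag and group-number steps can be sketched in Python as follows. This is only an illustration under assumptions: the input rows are hypothetical stand-ins for what the `rn = 1` filter returns, and `merge_adjacent` is a made-up name. It shows why `grp` has to be part of the final `GROUP BY`: the same event can surface in two separate stretches when a newer event sits in the middle of it.

```python
# Hypothetical rows, one per elementary range after the rn = 1 filter:
# (road_id, from_meas, to_meas, event_id)
rows = [
    (1, 0, 10, 2),
    (1, 10, 25, 2),
    (1, 25, 50, 1),
    (1, 50, 100, 2),
]

def merge_adjacent(rows):
    flagged = []
    prev = None  # (road_id, event_id) of the previous row
    grp = 0
    for road_id, from_meas, to_meas, event_id in sorted(rows):
        # adjacent_events: flag = 1 on the first row of a new stretch,
        # i.e. when the previous row on this road had a different event.
        flag = 0 if prev == (road_id, event_id) else 1
        # grouped_events: a running sum over the flags numbers the stretches.
        grp += flag
        flagged.append((road_id, grp, event_id, from_meas, to_meas))
        prev = (road_id, event_id)

    # Final step: Min(from_meas) / Max(to_meas) per (road_id, grp, event_id).
    merged = {}
    for road_id, grp, event_id, from_meas, to_meas in flagged:
        key = (road_id, grp, event_id)
        lo, hi = merged.get(key, (from_meas, to_meas))
        merged[key] = (min(lo, from_meas), max(hi, to_meas))
    return [(road_id, event_id, lo, hi)
            for (road_id, _, event_id), (lo, hi) in merged.items()]

# Each tuple is (road_id, event_id, from_meas, to_meas).
print(merge_adjacent(rows))
# → [(1, 2, 0, 25), (1, 1, 25, 50), (1, 2, 50, 100)]
```

The sketch keeps one running counter across all roads instead of restarting it per `road_id`, which gives the same groups because `road_id` is part of the grouping key.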
    

    Edit:

    Oops, I just found a blog post, "Overlapping ranges with priority", doing exactly the same with some simplified Oracle syntax. In fact, I translated my query from some other simplified syntax in Teradata to Standard/Oracle SQL :-)
