Aggregate Overlapping Segments to Measure Effective Length

我寻月下人不归 2021-02-07 02:08

I have a road_events table:

create table road_events (
    event_id number(4,0),
    road_id number(4,0),
    year number(4,0),
    from_meas number         


        
6 answers
  •  挽巷 (original poster)
    2021-02-07 02:29

    I thought about this too much today, but I now have a solution that ignores the +/- 10 meters.

    First, I made a function that takes from/to pairs as a string and returns the total distance covered by those pairs. For example, '10:20;35:45' returns 20.

    CREATE OR REPLACE FUNCTION get_distance_range_str (strRangeStr VARCHAR2)
    RETURN NUMBER
    IS
        intRetNum NUMBER;
    
    BEGIN
        --split input string
        WITH cte_1
        AS (
            SELECT regexp_substr(strRangeStr, '[^;]+', 1, LEVEL) AS TO_FROM_STRING
            FROM dual connect BY regexp_substr(strRangeStr, '[^;]+', 1, LEVEL) IS NOT NULL
            )
            --split From/To pairs
            ,cte_2
        AS (
            SELECT cte_1.TO_FROM_STRING
                ,to_number(substr(cte_1.TO_FROM_STRING, 1, instr(cte_1.TO_FROM_STRING, ':') - 1)) AS FROM_MEAS
                ,to_number(substr(cte_1.TO_FROM_STRING, instr(cte_1.TO_FROM_STRING, ':') + 1, length(cte_1.TO_FROM_STRING) - instr(cte_1.TO_FROM_STRING, ':'))) AS TO_MEAS
            FROM cte_1
            )
            --merge ranges
            ,cte_merge_ranges
        AS (
            SELECT s1.FROM_MEAS
                ,MIN(t1.TO_MEAS) AS TO_MEAS
            FROM cte_2 s1
            INNER JOIN cte_2 t1 ON s1.FROM_MEAS <= t1.TO_MEAS
                AND NOT EXISTS (
                    SELECT *
                    FROM cte_2 t2
                    WHERE t1.TO_MEAS >= t2.FROM_MEAS
                        AND t1.TO_MEAS < t2.TO_MEAS
                    )
            WHERE NOT EXISTS (
                    SELECT *
                    FROM cte_2 s2
                    WHERE s1.FROM_MEAS > s2.FROM_MEAS
                        AND s1.FROM_MEAS <= s2.TO_MEAS
                    )
            GROUP BY s1.FROM_MEAS
            )
        SELECT sum(TO_MEAS - FROM_MEAS) AS DISTANCE_COVERED
        INTO intRetNum
        FROM cte_merge_ranges;
    
        RETURN intRetNum;
    END;
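
    To sanity-check what the function is meant to return outside Oracle, here is a minimal Python sketch of the same idea. It is not a translation of the PL/SQL: it merges the ranges with a sort-and-sweep rather than the NOT EXISTS self-join, and the name covered_distance and the choice to return 0 for an empty string are assumptions of this sketch.

```python
def covered_distance(range_str):
    """Total length covered by 'from:to' pairs separated by ';'.

    Overlapping or touching ranges are counted once.
    Returning 0 for an empty string is an assumption of this sketch.
    """
    pairs = []
    for chunk in range_str.split(";"):
        if not chunk:
            continue  # skip the empty piece left by a trailing ';'
        start, end = chunk.split(":")
        pairs.append((float(start), float(end)))

    total = 0.0
    cur_start = cur_end = None
    for start, end in sorted(pairs):
        if cur_end is None or start > cur_end:
            # Gap before this range: close the merged range built so far.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlaps or touches the current merged range: extend it.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


print(covered_distance("10:20;35:45"))  # 20.0
```

    The trailing ';' is tolerated because the query further down builds its strings with one.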
    

    Then I wrote this query, which builds the input string for that function from the appropriate prior rows. I couldn't use a windowing clause with LISTAGG, but I was able to achieve the same result with a correlated subquery.

    --use LISTAGG to build the list of from:to pairs for the rows that precede the current row in the ordering
    WITH cte_2
    AS (
        SELECT T1.*
            ,(
                SELECT LISTAGG(FROM_MEAS || ':' || TO_MEAS || ';') WITHIN
                GROUP (
                        ORDER BY YEAR DESC, EVENT_ID DESC
                        )
                FROM road_events T2
                WHERE T1.YEAR || lpad(T1.EVENT_ID, 10,'0') < 
                    T2.YEAR || lpad(T2.EVENT_ID, 10,'0')
                    AND T1.ROAD_ID = T2.ROAD_ID
                GROUP BY road_id
                ) AS PRIOR_RANGES_STR
        FROM road_events T1
        )
        --get distance for prior range string - distance ignoring current row
        --get distance including current row
        ,cte_3
    AS (
        SELECT cte_2.*
            ,coalesce(get_distance_range_str(PRIOR_RANGES_STR), 0) AS DIST_PRIOR
            ,get_distance_range_str(PRIOR_RANGES_STR || FROM_MEAS || ':' || TO_MEAS || ';') AS DIST_NOW
        FROM cte_2
        )
        --distance including the current row, less distance ignoring it, is the distance this row adds to the range
        ,cte_4
    AS (
        SELECT cte_3.*
            ,DIST_NOW - DIST_PRIOR AS DIST_ADDED_THIS_ROW
        FROM cte_3
        )
    SELECT *
    FROM cte_4
    --filter out any rows with distance added as 0
    WHERE DIST_ADDED_THIS_ROW > 0
    ORDER BY ROAD_ID, YEAR DESC, EVENT_ID DESC
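
    The row-by-row logic of the query can be modelled in a few lines of Python, which may help when checking results by hand. This is only a sketch: the function names and the sample rows are made up for illustration, rows are assumed to be tuples of (event_id, road_id, year, from_meas, to_meas), and the range merge is a sort-and-sweep rather than the SQL above.

```python
from itertools import groupby


def covered(intervals):
    """Length covered by (from, to) intervals, counting overlaps once."""
    total, cur_start, cur_end = 0, None, None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def distance_added(rows):
    """Return (event_id, distance added) for rows that add new coverage.

    Within each road, rows are visited newest first (year, then event_id,
    both descending), mirroring the ORDER BY used in the query.
    """
    ordered = sorted(rows, key=lambda r: (r[1], -r[2], -r[0]))
    result = []
    for _, group in groupby(ordered, key=lambda r: r[1]):
        prior = []  # ranges of the rows already visited on this road
        for event_id, _road_id, _year, from_meas, to_meas in group:
            dist_prior = covered(prior)
            dist_now = covered(prior + [(from_meas, to_meas)])
            if dist_now - dist_prior > 0:  # drop rows that add nothing
                result.append((event_id, dist_now - dist_prior))
            prior.append((from_meas, to_meas))
    return result


# Hypothetical sample rows, not the data from the question.
sample_rows = [
    (1, 1, 2020, 25, 50),
    (2, 1, 2000, 25, 50),   # fully covered by event 1, so it is dropped
    (3, 1, 1980, 0, 25),
    (4, 1, 1960, 75, 100),
    (5, 1, 1940, 1, 100),   # only the 50-75 gap is new
    (6, 2, 2010, 0, 10),
]
print(distance_added(sample_rows))
```

    Keeping a running list per road avoids rebuilding the prior-ranges string for every row, which is what the correlated subquery does.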
    

    sqlfiddle here: http://sqlfiddle.com/#!4/81331/36

    Looks to me like the results match yours. I left the additional columns in the final query to try to illustrate each step.

    Works on the test case - might need some work to handle all possibilities in a larger data set, but I think this would be a good place to start and refine.

    Credit for the overlapping-range merge goes to the first answer here: Merge overlapping date intervals

    Credit for LISTAGG with windowing goes to the first answer here: LISTAGG equivalent with windowing clause
