SQL - Find missing int values in mostly ordered sequential series

有刺的猬 2020-12-06 08:59

I manage a message-based system in which a sequence of unique integer ids will be entirely represented at the end of the day, though they will not necessarily arrive in orde

6 answers
  • 2020-12-06 09:02
    SET search_path='tmp';
    
    DROP TABLE IF EXISTS tmp.table_name CASCADE;
    CREATE TABLE tmp.table_name ( num INTEGER NOT NULL PRIMARY KEY);
    -- make some data
    INSERT INTO tmp.table_name(num) SELECT generate_series(1,20);
    -- create some gaps
    DELETE FROM tmp.table_name WHERE random() < 0.3 ;
    
    SELECT * FROM table_name;
    
    -- EXPLAIN ANALYZE
    WITH zbot AS (
        SELECT 1+tn.num  AS num
        FROM table_name tn
        WHERE NOT EXISTS (
            SELECT * FROM table_name nx
            WHERE nx.num = tn.num+1
            )
        )
    , ztop AS (
        SELECT -1+tn.num  AS num
        FROM table_name tn
        WHERE NOT EXISTS (
            SELECT * FROM table_name nx
            WHERE nx.num = tn.num-1
            )
        )
    SELECT zbot.num AS bot
        ,ztop.num AS top
    FROM zbot, ztop
    WHERE zbot.num <= ztop.num
    AND NOT EXISTS ( SELECT *
        FROM table_name nx
        WHERE nx.num >= zbot.num
        AND nx.num <= ztop.num
        )
    ORDER BY bot,top
        ;
    

    Result:

    CREATE TABLE
    INSERT 0 20
    DELETE 9
     num 
    -----
       1
       2
       6
       7
      10
      11
      13
      14
      15
      18
      19
    (11 rows)
    
     bot | top 
    -----+-----
       3 |   5
       8 |   9
      12 |  12
      16 |  17
    (4 rows)
    

    Note: a recursive CTE is also possible (and probably shorter).

    UPDATE: here is the recursive CTE:

    WITH RECURSIVE tree AS (
        SELECT 1+num AS num
        FROM table_name t0
        UNION
        SELECT 1+num FROM tree tt
        WHERE EXISTS ( SELECT *
            FROM table_name xt
            WHERE xt.num > tt.num
            )
        )
    SELECT * FROM tree
    WHERE NOT EXISTS (
        SELECT *
        FROM table_name nx
        WHERE nx.num = tree.num
        )
    ORDER BY num
        ;
    

    Results: (same data)

     num 
    -----
       3
       4
       5
       8
       9
      12
      16
      17
      20
     (9 rows)
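
    In the result above, 20 is 1 + max(num) rather than a gap inside the series: the anchor term of the recursive CTE emits num + 1 for every row, including the largest one. A minimal sketch of one way to avoid that, assuming the same 11 sample rows. It runs against SQLite through Python's sqlite3 module purely so that it is self-contained, and the extra EXISTS on the anchor term is an addition here, not part of the original query:

```python
import sqlite3

# In-memory SQLite database standing in for the PostgreSQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (num INTEGER NOT NULL PRIMARY KEY)")
sample = [1, 2, 6, 7, 10, 11, 13, 14, 15, 18, 19]  # rows shown in the listing above
conn.executemany("INSERT INTO table_name (num) VALUES (?)", [(n,) for n in sample])

query = """
WITH RECURSIVE tree(num) AS (
    SELECT 1 + t0.num
    FROM table_name t0
    -- added: skip the largest row, so 1 + max(num) is never generated
    WHERE EXISTS (SELECT 1 FROM table_name xt WHERE xt.num > t0.num)
    UNION
    SELECT 1 + tt.num
    FROM tree tt
    WHERE EXISTS (SELECT 1 FROM table_name xt WHERE xt.num > tt.num)
)
SELECT num
FROM tree
WHERE NOT EXISTS (SELECT 1 FROM table_name nx WHERE nx.num = tree.num)
ORDER BY num
"""
gaps = [row[0] for row in conn.execute(query)]
print(gaps)  # [3, 4, 5, 8, 9, 12, 16, 17]
```

    With that restriction the output lists only values strictly between min(num) and max(num).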
    
  • 2020-12-06 09:06

    This is sometimes called an exclusion join. That is, try to do a join and return only rows where there is no match.

    SELECT t1.value-1
    FROM ThisTable AS t1
    LEFT OUTER JOIN ThisTable AS t2
      ON t1.value = t2.value+1
    WHERE t2.value IS NULL
    

    Note that this always reports at least one row: MIN(value) - 1, because the smallest value never has a predecessor in the table.

    Also, if a gap spans two or more numbers, only the last missing value of that gap is reported.
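
    Both caveats can be seen on a small hypothetical data set (1, 2, 5, 6, 9). The sketch below runs the join against SQLite via Python's sqlite3 module, only so that it is self-contained. The extra predicate that drops the MIN - 1 row is an addition for illustration, not part of the query above:

```python
import sqlite3

# Hypothetical sample data: the gaps are 3-4 and 7-8.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ThisTable (value INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO ThisTable (value) VALUES (?)",
                 [(v,) for v in (1, 2, 5, 6, 9)])

query = """
SELECT t1.value - 1 AS missing
FROM ThisTable AS t1
LEFT OUTER JOIN ThisTable AS t2
  ON t1.value = t2.value + 1
WHERE t2.value IS NULL
  -- added: the smallest value has no predecessor by definition, so skip it
  AND t1.value > (SELECT MIN(value) FROM ThisTable)
ORDER BY missing
"""
missing = [row[0] for row in conn.execute(query)]
print(missing)  # [4, 8]
```

    Only 4 and 8 come back, although 3 and 7 are missing as well, so this form is best suited to detecting that a gap exists rather than listing every missing id.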

  • 2020-12-06 09:09

    I tried it in MySQL and it worked:

    mysql> select * from sequence;
    +--------+
    | number |
    +--------+
    |      1 |
    |      2 |
    |      4 |
    |      6 |
    |      7 |
    |      8 |
    +--------+
    6 rows in set (0.00 sec)
    
    mysql> SELECT t1.number - 1 FROM sequence AS t1 LEFT OUTER JOIN sequence AS t2 ON t1.number = t2.number + 1 WHERE t2.number IS NULL;
    +---------------+
    | t1.number - 1 |
    +---------------+
    |             0 |
    |             3 |
    |             5 |
    +---------------+
    3 rows in set (0.00 sec)
    
  • 2020-12-06 09:13

    You didn't state your DBMS, so I'm assuming PostgreSQL:

    select aid as missing_id
    from generate_series( (select min(id) from message), (select max(id) from message)) as aid
      left join message m on m.id = aid
    where m.id is null;  
    

    This reports every missing value between the minimum and maximum id in your table, including gaps that span more than one value.

    psql (9.1.1)
    Type "help" for help.
    
    postgres=> select * from message;
     id
    ----
      1
      2
      3
      4
      5
      7
      8
      9
     11
     14
    (10 rows)
    
    
    postgres=> select aid as missing_id
    postgres-> from generate_series( (select min(id) from message), (select max(id) from message)) as aid
    postgres->   left join message m on m.id = aid
    postgres-> where m.id is null;
     missing_id
    ------------
              6
             10
             12
             13
    (4 rows)
    postgres=>
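
    generate_series is not available in every DBMS. Where it is missing, a recursive CTE can stand in for the series. A sketch under that assumption, reusing the sample ids above and running on SQLite via Python's sqlite3 module so that no server is needed (the CTE name series is made up for this example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE message (id INTEGER PRIMARY KEY)")
ids = [1, 2, 3, 4, 5, 7, 8, 9, 11, 14]  # same rows as in the psql session above
conn.executemany("INSERT INTO message (id) VALUES (?)", [(i,) for i in ids])

query = """
WITH RECURSIVE series(aid) AS (
    -- start at the smallest id and count up to the largest one
    SELECT MIN(id) FROM message
    UNION ALL
    SELECT aid + 1 FROM series
    WHERE aid < (SELECT MAX(id) FROM message)
)
SELECT aid AS missing_id
FROM series
LEFT JOIN message m ON m.id = series.aid
WHERE m.id IS NULL
ORDER BY aid
"""
missing = [row[0] for row in conn.execute(query)]
print(missing)  # [6, 10, 12, 13]
```

    The recursive term produces one row per integer in the range, so for very wide id ranges a native series generator is likely to be cheaper.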
  • 2020-12-06 09:27

    I've been there.

    For Oracle:

    I found this very useful query online a while ago and noted it down. I no longer remember the site, but you can search for "GAP ANALYSIS" on Google.

    SELECT CASE
             WHEN ids + 1 = lead_no - 1 THEN TO_CHAR(ids + 1)
             ELSE TO_CHAR(ids + 1) || '-' || TO_CHAR(lead_no - 1)
           END AS missing_track_no
    FROM  (SELECT ids,
                  LEAD(ids, 1, NULL) OVER (ORDER BY ids ASC) AS lead_no
           FROM   YOURTABLE)
    WHERE  lead_no != ids + 1
    

    Here, the result is:

    MISSING_TRACK_NO
    -----------------
           6
    

    If there were several gaps, say the values 2, 6, 7 and 9 were missing, the result would be:

    MISSING_TRACK_NO
    -----------------
            2
           6-7
            9
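
    The multi-gap case can be reproduced with hypothetical rows 1, 3, 4, 5, 8, 10, chosen so that 2, 6, 7 and 9 are missing. The sketch below is an adaptation, not the Oracle query itself: it runs on SQLite via Python's sqlite3 module, uses CAST in place of TO_CHAR, and assumes SQLite 3.25 or later for window function support:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (ids INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO yourtable (ids) VALUES (?)",
                 [(i,) for i in (1, 3, 4, 5, 8, 10)])  # hypothetical data

query = """
SELECT CASE
         WHEN ids + 1 = lead_no - 1 THEN CAST(ids + 1 AS TEXT)
         ELSE CAST(ids + 1 AS TEXT) || '-' || CAST(lead_no - 1 AS TEXT)
       END AS missing_track_no
FROM (SELECT ids,
             LEAD(ids) OVER (ORDER BY ids) AS lead_no  -- next existing id
      FROM yourtable) AS t
WHERE lead_no <> ids + 1   -- the last row has lead_no NULL and drops out
ORDER BY ids
"""
ranges = [row[0] for row in conn.execute(query)]
print(ranges)  # ['2', '6-7', '9']
```

    Each output row describes one whole gap, so the number of rows depends on the number of gaps rather than on how many ids are missing.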
    
  • 2020-12-06 09:28
    -- Pair each key with the next one; a difference other than 1 marks a gap.
    select student_key, next_student_key
      from (
        select student_key,
               lead(student_key) over (order by student_key) as next_student_key
          from student_table
           ) t
     where student_key <> next_student_key - 1;
    