Only select first row of repeating value in a column in SQL

南方客 2021-02-08 10:07

I have a table with a column that may contain the same value in a burst of consecutive rows. Like this:

+----+---------+
| id |   Col1  | 
+----+---------+
| 1  | 6050000 |
+----+----         


        
4 Answers
  •  北海茫月
    2021-02-08 10:38

    You can use an EXISTS semi-join to identify candidates, assuming the ids are sequential without gaps:

    Select wanted rows:

    SELECT * FROM tbl t
    WHERE  NOT EXISTS (
        SELECT *
        FROM   tbl
        WHERE  col1 = t.col1
        AND    id = t.id - 1
        )
    ORDER  BY id;
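
    To try the query above without a database server, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows and the function name first_of_each_burst are assumptions made up for illustration, since the table in the question is cut off; the SQL itself is the query from this answer.

```python
import sqlite3

# Hypothetical sample data: the question's table is cut off, so these rows
# are invented for illustration only. Ids are sequential without gaps.
ROWS = [(1, 6050000), (2, 6050000), (3, 6050000),
        (4, 6060000), (5, 6060000), (6, 6050000)]


def first_of_each_burst(rows):
    """Return the rows whose predecessor (id - 1) does not hold the same col1."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, col1 INTEGER)")
    con.executemany("INSERT INTO tbl VALUES (?, ?)", rows)
    wanted = con.execute("""
        SELECT * FROM tbl t
        WHERE  NOT EXISTS (
            SELECT *
            FROM   tbl
            WHERE  col1 = t.col1
            AND    id = t.id - 1
            )
        ORDER  BY id
    """).fetchall()
    con.close()
    return wanted


print(first_of_each_burst(ROWS))
# [(1, 6050000), (4, 6060000), (6, 6050000)]
```

    Note that the test relies on id - 1 existing: with rows (1, 5) and (3, 5) both rows come back, because there is no row with id 2 to compare against.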
    

    Get rid of unwanted rows:

    DELETE FROM tbl AS t
    -- SELECT * FROM tbl t  -- check first?
    WHERE EXISTS (
        SELECT *
        FROM   tbl
        WHERE  col1 = t.col1
        AND    id   = t.id - 1
        );
    

    This effectively deletes every row where the preceding row has the same value in col1, thereby arriving at your set goal: only the first row of every burst survives.

    I left the commented SELECT statement because you should always check what is going to be deleted before you do the deed.
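
    The "check first, then delete" routine could look roughly like the sketch below, again using Python's sqlite3 module. The sample rows and the helper name delete_repeats are hypothetical. The DELETE is written with an alias on the inner table (prev) instead of on the DELETE target, which is an adaptation for SQLite versions that do not accept an alias there; the condition is the same.

```python
import sqlite3


def delete_repeats(con):
    """Preview the rows that would be removed, then delete them."""
    # Same EXISTS test as the DELETE, run as a SELECT first ("check first").
    doomed = con.execute("""
        SELECT * FROM tbl t
        WHERE  EXISTS (
            SELECT *
            FROM   tbl
            WHERE  col1 = t.col1
            AND    id   = t.id - 1
            )
        ORDER  BY id
    """).fetchall()
    with con:  # commits on success, rolls back if an exception is raised
        con.execute("""
            DELETE FROM tbl
            WHERE  EXISTS (
                SELECT *
                FROM   tbl AS prev
                WHERE  prev.col1 = tbl.col1
                AND    prev.id   = tbl.id - 1
                )
        """)
    return doomed


# Hypothetical sample data, invented for illustration only.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, col1 INTEGER)")
con.executemany("INSERT INTO tbl VALUES (?, ?)",
                [(1, 6050000), (2, 6050000), (3, 6050000),
                 (4, 6060000), (5, 6060000), (6, 6050000)])
con.commit()

removed = delete_repeats(con)
remaining = con.execute("SELECT * FROM tbl ORDER BY id").fetchall()
print(removed)
print(remaining)
```

    In real use you would inspect the previewed rows before letting the DELETE run, rather than doing both in one call.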

    Solution for non-sequential IDs:

    If your RDBMS supports CTEs and window functions (like PostgreSQL, Oracle, SQL Server, ... but not SQLite prior to v3.25, MS Access or MySQL prior to v8.0.1), there is an elegant way:

    WITH cte AS (
        SELECT *, row_number() OVER (ORDER BY id) AS rn
        FROM   tbl
        )
    SELECT id, col1
    FROM   cte c
    WHERE  NOT EXISTS (
        SELECT *
        FROM   cte
        WHERE  col1 = c.col1
        AND    rn   = c.rn - 1
        )
    ORDER  BY id;
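
    A small sketch of the CTE variant on ids with gaps, using Python's sqlite3 module (this needs an underlying SQLite of 3.25 or later for row_number()). The sample rows and the function name first_of_each_burst_rn are assumptions for illustration only.

```python
import sqlite3

# Hypothetical sample data with gaps in the ids, invented for illustration.
GAPPED_ROWS = [(10, 6050000), (12, 6050000), (17, 6060000),
               (18, 6060000), (30, 6050000)]


def first_of_each_burst_rn(rows):
    """Number the rows gaplessly with row_number(), then compare neighbours."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, col1 INTEGER)")
    con.executemany("INSERT INTO tbl VALUES (?, ?)", rows)
    wanted = con.execute("""
        WITH cte AS (
            SELECT *, row_number() OVER (ORDER BY id) AS rn
            FROM   tbl
            )
        SELECT id, col1
        FROM   cte c
        WHERE  NOT EXISTS (
            SELECT *
            FROM   cte
            WHERE  col1 = c.col1
            AND    rn   = c.rn - 1
            )
        ORDER  BY id
    """).fetchall()
    con.close()
    return wanted


if sqlite3.sqlite_version_info >= (3, 25, 0):
    print(first_of_each_burst_rn(GAPPED_ROWS))
```

    The generated rn plays the role that the gapless id played in the first query, so the gaps between 10, 12, 17, 18 and 30 no longer matter.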
    

    Another way to do the job without those niceties (should work for you):

    SELECT id, col1
    FROM   tbl t
    WHERE  (
        SELECT col1 = t.col1
        FROM   tbl
        WHERE  id < t.id
        ORDER  BY id DESC
        LIMIT  1) IS NOT TRUE
    ORDER  BY id;
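
    The correlated-subquery variant can be tried the same way. This is a sketch with Python's sqlite3 module, assuming an underlying SQLite that understands IS NOT TRUE (3.23 or later); the sample rows and the function name first_of_each_burst_subquery are hypothetical.

```python
import sqlite3

# Hypothetical sample data with gaps in the ids, invented for illustration.
GAPPED_ROWS = [(10, 6050000), (12, 6050000), (17, 6060000),
               (18, 6060000), (30, 6050000)]


def first_of_each_burst_subquery(rows):
    """Keep a row unless the closest row with a smaller id has the same col1."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, col1 INTEGER)")
    con.executemany("INSERT INTO tbl VALUES (?, ?)", rows)
    wanted = con.execute("""
        SELECT id, col1
        FROM   tbl t
        WHERE  (
            SELECT col1 = t.col1
            FROM   tbl
            WHERE  id < t.id
            ORDER  BY id DESC
            LIMIT  1) IS NOT TRUE
        ORDER  BY id
    """).fetchall()
    con.close()
    return wanted


print(first_of_each_burst_subquery(GAPPED_ROWS))
```

    For the very first row the subquery finds no predecessor and yields NULL, and NULL IS NOT TRUE holds, so that row is kept. For the same reason, a row whose col1 is NULL is always kept.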
    
