PostgreSQL gapless sequences

前端 未结 1 1045
轮回少年
轮回少年 2020-11-22 16:55

I\'m moving from MySql to Postgres, and I noticed that when you delete rows from MySql, the unique ids for those rows are re-used when you make new ones. With Postgres, if

相关标签:
1条回答
  • 2020-11-22 17:02

    Sequences have gaps to permit concurrent inserts. Attempting to avoid gaps or to re-use deleted IDs creates horrible performance problems. See the PostgreSQL wiki FAQ.

    PostgreSQL SEQUENCEs are used to allocate IDs. These only ever increase, and they're exempt from the usual transaction rollback rules to permit multiple transactions to grab new IDs at the same time. This means that if a transaction rolls back, those IDs are "thrown away"; there's no list of "free" IDs kept, just the current ID counter. Sequences are also usually incremented if the database shuts down uncleanly.

    Synthetic keys (IDs) are meaningless anyway. Their order is not significant, their only property of significance is uniqueness. You can't meaningfully measure how "far apart" two IDs are, nor can you meaningfully say if one is greater or less than another. All you can do is say "equal" or "not equal". Anything else is unsafe. You shouldn't care about gaps.

    If you need a gapless sequence that re-uses deleted IDs, you can have one, you just have to give up a huge amount of performance for it - in particular, you cannot have any concurrency on INSERTs at all, because you have to scan the table for the lowest free ID, locking the table for write so no other transaction can claim the same ID. Try searching for "postgresql gapless sequence".

    The simplest approach is to use a counter table and a function that gets the next ID. Here's a generalized version that uses a counter table to generate consecutive gapless IDs; it doesn't re-use IDs, though.

    CREATE TABLE thetable_id_counter ( last_id integer not null );
    INSERT INTO thetable_id_counter VALUES (0);
    
    CREATE OR REPLACE FUNCTION get_next_id(countertable regclass, countercolumn text) RETURNS integer AS $$
    DECLARE
        next_value integer;
    BEGIN
        EXECUTE format('UPDATE %s SET %I = %I + 1 RETURNING %I', countertable, countercolumn, countercolumn, countercolumn) INTO next_value;
        RETURN next_value;
    END;
    $$ LANGUAGE plpgsql;
    
    COMMENT ON get_next_id(countername regclass) IS 'Increment and return value from integer column $2 in table $1';
    

    Usage:

    INSERT INTO dummy(id, blah) 
    VALUES ( get_next_id('thetable_id_counter','last_id'), 42 );
    

    Note that when one open transaction has obtained an ID, all other transactions that try to call get_next_id will block until the 1st transaction commits or rolls back. This is unavoidable and for gapless IDs and is by design.

    If you want to store multiple counters for different purposes in a table, just add a parameter to the above function, add a column to the counter table, and add a WHERE clause to the UPDATE that matches the parameter to the added column. That way you can have multiple independently-locked counter rows. Do not just add extra columns for new counters.

    This function does not re-use deleted IDs, it just avoids introducing gaps.

    To re-use IDs I advise ... not re-using IDs.

    If you really must, you can do so by adding an ON INSERT OR UPDATE OR DELETE trigger on the table of interest that adds deleted IDs to a free-list side table, and removes them from the free-list table when they're INSERTed. Treat an UPDATE as a DELETE followed by an INSERT. Now modify the ID generation function above so that it does a SELECT free_id INTO next_value FROM free_ids FOR UPDATE LIMIT 1 and if found, DELETEs that row. IF NOT FOUND gets a new ID from the generator table as normal. Here's an untested extension of the prior function to support re-use:

    CREATE OR REPLACE FUNCTION get_next_id_reuse(countertable regclass, countercolumn text, freelisttable regclass, freelistcolumn text) RETURNS integer AS $$
    DECLARE
        next_value integer;
    BEGIN
        EXECUTE format('SELECT %I FROM %s FOR UPDATE LIMIT 1', freelistcolumn, freelisttable) INTO next_value;
        IF next_value IS NOT NULL THEN
            EXECUTE format('DELETE FROM %s WHERE %I = %L', freelisttable, freelistcolumn, next_value);
        ELSE
            EXECUTE format('UPDATE %s SET %I = %I + 1 RETURNING %I', countertable, countercolumn, countercolumn, countercolumn) INTO next_value;
        END IF;
        RETURN next_value;
    END;
    $$ LANGUAGE plpgsql;
    
    0 讨论(0)
提交回复
热议问题