Adding an identity to an existing column

温柔的废话 2020-11-21 13:16

I need to change the primary key of a table to an identity column, and there's already a number of rows in the table.

I've got a script to clean up the IDs to ensure

19 answers
  •  小蘑菇 (OP)
    2020-11-21 13:58

    In SQL Server 2005 and above, there's a trick to solve this problem without changing the table's data pages. This is important for large tables, where touching every data page can take minutes or hours. The trick also works when the identity column is a primary key or is part of a clustered or non-clustered index, and it avoids other gotchas that can trip up the simpler "add/remove/rename column" solution.

    Here's the trick: you can use SQL Server's ALTER TABLE...SWITCH statement to change the schema of a table without changing the data. This means you can replace a table that has an IDENTITY column with a table of identical schema but without the IDENTITY property. The same trick works to add IDENTITY to an existing column.

    Normally, ALTER TABLE...SWITCH is used to efficiently replace a full partition in a partitioned table with a new, empty partition, but it can also be used on non-partitioned tables.

    I've used this trick to convert, in under 5 seconds, a column of a 2.5-billion-row table from IDENTITY to non-IDENTITY (in order to run a multi-hour query whose query plan worked better for non-IDENTITY columns), and then restored the IDENTITY setting, again in less than 5 seconds.

    Here's a code sample of how it works.

     CREATE TABLE Test
     (
       id int identity(1,1),
       somecolumn varchar(10)
     );
    
     INSERT INTO Test VALUES ('Hello');
     INSERT INTO Test VALUES ('World');
    
     -- copy the table. use same schema, but no identity
     CREATE TABLE Test2
     (
       id int NOT NULL,
       somecolumn varchar(10)
     );
    
     ALTER TABLE Test SWITCH TO Test2;
    
     -- drop the original (now empty) table
     DROP TABLE Test;
    
     -- rename new table to old table's name
     EXEC sp_rename 'Test2','Test';
    
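     -- when adding IDENTITY instead (Test2 declared with identity(1,1)),
     -- update the identity seed at this point with: DBCC CHECKIDENT('Test');
     -- it is skipped here because Test no longer has an identity column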
    
     -- see same records
     SELECT * FROM Test; 
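
    The sample above removes IDENTITY. For the direction the question asks about (adding IDENTITY to an existing column), the same steps can be run in reverse. The following is a minimal sketch, not a tested migration script; the table names Orders and Orders_New are hypothetical and stand in for your own table and its copy.

    ```sql
    -- Hypothetical existing table: id is a plain column, no IDENTITY.
    CREATE TABLE Orders
    (
      id int NOT NULL,
      somecolumn varchar(10)
    );

    INSERT INTO Orders VALUES (1, 'Hello');
    INSERT INTO Orders VALUES (2, 'World');

    -- Copy of the table with the same schema, but with IDENTITY on id.
    CREATE TABLE Orders_New
    (
      id int IDENTITY(1,1) NOT NULL,
      somecolumn varchar(10)
    );

    -- Metadata-only move of all rows into the IDENTITY table.
    ALTER TABLE Orders SWITCH TO Orders_New;

    -- Drop the original (now empty) table and take over its name.
    DROP TABLE Orders;
    EXEC sp_rename 'Orders_New', 'Orders';

    -- Raise the current identity value to the largest id already in the table.
    DBCC CHECKIDENT('Orders');

    -- New rows no longer supply an id; it should be generated above the existing maximum.
    INSERT INTO Orders (somecolumn) VALUES ('Again');
    SELECT * FROM Orders;
    ```

    In this direction the DBCC CHECKIDENT call matters, because the switched-in rows do not advance the identity counter on their own.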
    

    This is obviously more involved than the solutions in other answers, but if your table is large this can be a real life-saver. There are some caveats:

    • As far as I know, identity is the only thing you can change about your table's columns with this method. Adding/removing columns, changing nullability, etc. aren't allowed.
    • You'll need to drop foreign keys before you do the switch and restore them after.
    • Same for WITH SCHEMABINDING functions, views, etc.
    • The new table's indexes need to match exactly (same columns, same order, etc.).
    • Old and new tables need to be on the same filegroup.
    • Only works on SQL Server 2005 or later.
    • I previously believed that this trick only works on the Enterprise or Developer editions of SQL Server (because partitions are only supported in those editions), but Mason G. Zhwiti says in his comment below that it also works in SQL Server Standard Edition. I assume this means that the restriction to Enterprise or Developer doesn't apply to ALTER TABLE...SWITCH.
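
    Before attempting the switch, it may help to list the objects these caveats refer to. The queries below are a rough sketch against the system catalog views, assuming the table is dbo.Test as in the sample; sys.sql_expression_dependencies is only available from SQL Server 2008 onward, so on 2005 that query would need a different approach.

    ```sql
    -- Foreign keys that reference the table or are defined on it.
    SELECT fk.name AS foreign_key_name,
           OBJECT_NAME(fk.parent_object_id) AS referencing_table,
           OBJECT_NAME(fk.referenced_object_id) AS referenced_table
    FROM sys.foreign_keys AS fk
    WHERE fk.referenced_object_id = OBJECT_ID('dbo.Test')
       OR fk.parent_object_id = OBJECT_ID('dbo.Test');

    -- Schema-bound views and functions that depend on the table (SQL Server 2008+).
    SELECT OBJECT_NAME(d.referencing_id) AS schema_bound_object
    FROM sys.sql_expression_dependencies AS d
    WHERE d.referenced_id = OBJECT_ID('dbo.Test')
      AND d.is_schema_bound_reference = 1;

    -- Indexes that the new table has to reproduce exactly.
    SELECT i.name AS index_name, i.type_desc, i.is_primary_key, i.is_unique
    FROM sys.indexes AS i
    WHERE i.object_id = OBJECT_ID('dbo.Test');
    ```

    Anything returned by the first two queries is a candidate to script out, drop, and recreate around the switch; the third is a checklist for the new table's definition.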

    There's a good article on TechNet detailing the requirements above.

    UPDATE - Eric Wu had a comment below that adds important info about this solution. Copying it here to make sure it gets more attention:

    There's another caveat here that is worth mentioning. Although the new table will happily receive data from the old table, and all the new rows will be inserted following an identity pattern, they will start at 1 and potentially break if the said column is a primary key. Consider running DBCC CHECKIDENT('') immediately after switching. See msdn.microsoft.com/en-us/library/ms176057.aspx for more info.

    If the table is actively being extended with new rows (meaning you don't have much, if any, downtime between adding IDENTITY and adding new rows), then instead of DBCC CHECKIDENT you'll want to manually set the identity seed value in the new table schema to be larger than the largest existing ID in the table, e.g. IDENTITY (2435457, 1). You might be able to include both the ALTER TABLE...SWITCH and the DBCC CHECKIDENT in a transaction (I haven't tested this), but setting the seed value manually seems easier and safer.

    Obviously, if no new rows are being added to the table (or they're only added occasionally, like a daily ETL process) then this race condition won't happen so DBCC CHECKIDENT is fine.
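
    The manual-seed approach can be sketched roughly as follows. This assumes that Test currently has a plain "id int NOT NULL" column and that Test2 is the new copy with IDENTITY; the safety margin of 1000 is a hypothetical value and should be sized to your insert rate.

    ```sql
    DECLARE @seed bigint;
    DECLARE @sql nvarchar(max);

    -- Largest existing id plus a hypothetical safety margin,
    -- leaving room for rows inserted while the script runs.
    SELECT @seed = ISNULL(CAST(MAX(id) AS bigint), 0) + 1000 FROM Test;

    -- IDENTITY(seed, increment) only accepts literals, so build the statement dynamically.
    SET @sql = N'CREATE TABLE Test2
    (
      id int IDENTITY(' + CAST(@seed AS nvarchar(20)) + N', 1) NOT NULL,
      somecolumn varchar(10)
    );';
    EXEC sp_executesql @sql;

    -- Switch, drop, and rename as in the sample; no DBCC CHECKIDENT is needed afterwards.
    ALTER TABLE Test SWITCH TO Test2;
    DROP TABLE Test;
    EXEC sp_rename 'Test2', 'Test';
    ```

    The trade-off is a gap in the id sequence equal to the margin, in exchange for not depending on a reseed step after the table is already live.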
