What is the fastest way to look for duplicate uniqueidentifier in Sql Server?

问题

We use uniqueidentifier for every record within a very large database. For business reasons we need to ensure that the uniqueidentifier is never used more than once, but for performance reasons we have a bigint as the primary key.

What is the fastest way to test the existence of a uniqueidentifer in a Sql server table?

回答1:

< 0.05ms to validate a uniqueidentifier from 100,000,000 rows on a single Standard S0 Sql Azure instance.

DISCLAIMER: The following aproach may require tweaking to your business requirements, and currently has only been run in a Sql Azure staging environment on an S0 instance (10 DTU's). The goal being for proof of concept.

(See below for boilerplate CREATE To scripts)

Method: Create a table specifically for storing any uniqueidentifiers that exist within your database. This table will be heavily optimised for the single purpose of augmenting our uniqueidentifer validator.

The table will have four columns. One for the uniqueidentifier, and the other three as single digit binary computed columns for storing each of the three digits of the uniqueidentifier.

We'll then create a clustered index on the first three digit columns, and the uniqueidentifer as the last column of the clustered index.

We'll lastly create a stored procedure that'll take a uniqueidentifer and break out the first three digits as binary datatype and perform the lookup, taking advantage of the B-tree structure of the data on the disk as per our clustered index.

Table CREATE To (with clustered index):

USE [MyDatabase]
GO

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

SET ARITHABORT ON
GO

SET ANSI_PADDING ON
GO

CREATE TABLE [dbo].[UniqueIds](
    [Guid] [uniqueidentifier] NOT NULL,
    [Char01]  AS (CONVERT([char](1),substring(CONVERT([char](36),[Guid]),(1),(1)))) PERSISTED NOT NULL,
    [Char02]  AS (CONVERT([char](1),substring(CONVERT([char](36),[Guid]),(2),(1)))) PERSISTED NOT NULL,
    [Char03]  AS (CONVERT([char](1),substring(CONVERT([char](36),[Guid]),(3),(1)))) PERSISTED NOT NULL,
 CONSTRAINT [PK_UniqueIds] PRIMARY KEY CLUSTERED 
(
    [Char01] ASC,
    [Char02] ASC,
    [Char03] ASC,
    [Guid] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
)

GO

SET ANSI_PADDING OFF
GO

Stored procedure for validation:

CREATE PROCEDURE [dbo].[UniqueIds.CountIds]
    @Guid uniqueidentifier,
    @IdCount bigint OUTPUT
AS
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    -- Break out uniqueidentifier down into the first three characters for indexed lookup.
    DECLARE @SubChar char(3) = CONVERT([char](36),@Guid);
    DECLARE @Char01 char(1) = @SubChar;
    DECLARE @Char02 char(1) = SUBSTRING(@SubChar,2,1);
    DECLARE @Char03 char(1) = RIGHT(@SubChar,1);

    -- Check if GUID already exists
    (SELECT TOP 1 @IdCount=1 FROM UniqueIds WHERE Char01=@Char01 AND Char02=@Char02 AND Char03=@Char03 AND [Guid]=@Guid);

Stored procedure for INSERTS:

CREATE PROCEDURE [dbo].[UniqueIds.Insert]
    @Guid uniqueidentifier
AS
BEGIN TRY
    BEGIN TRAN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    -- Check if GUID already exists
    DECLARE @IdCount bigint;
    EXEC    [UniqueIds.CountIds]
            @Guid=@Guid,
            @IdCount = @IdCount OUTPUT

    IF @IdCount IS NULL
        INSERT INTO UniqueIds
        (
            [Guid]
        )
        VALUES
        (
            @Guid
        )
    ELSE
        THROW 60000, '[Guid] must be unique. Another unique identifier with the same signature exists.', 1;

    COMMIT

END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK;
    THROW;
END CATCH

Inserting a new `uniqueidentifier` (example):

DECLARE @id uniqueidentifier
SET @id=NEWID()
EXEC    [UniqueIds.Insert]
        @Guid=@id

Final word on implementation:

Every time you insert a new uniqueidentifier anywhere in your database simply wrap it in a rollback transaction and call the stored procedure for inserting a uniqueidentifer. That will call our validation stored procedure and if it fails it'll throw an error. Your rollback will make sure nothing else is persisted.

Example:

CREATE PROCEDURE [dbo].[Bases.Insert]
    @Guid uniqueidentifier,
    @AccountId bigint=0,
    @ModifierId bigint=0,
    @ScopeIdentity bigint OUTPUT
AS
**BEGIN TRY
    BEGIN TRAN**
        -- SET NOCOUNT ON added to prevent extra result sets from
        -- interfering with SELECT statements.
        SET NOCOUNT ON

        **EXEC  [dbo].[UniqueIds.Insert]
        @Guid = @Guid;**

        -- Insert the base row
        INSERT INTO Bases
        (
            [Guid],
            [State],
            Utc,
            AccountId,
            ModifierId
        )
        VALUES
        (
            @Guid,
            0,
            GETUTCDATE(),
            @AccountId,
            @ModifierId
        );

    **COMMIT;**

    SELECT @ScopeIdentity = CAST(SCOPE_IDENTITY() As bigint);

**END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK;
    THROW;
END CATCH**

Result: validating the uniqueness of a uniqueidentifier from 100 MILLION rows consistently takes < 0.05ms

来源：https://stackoverflow.com/questions/37202746/what-is-the-fastest-way-to-look-for-duplicate-uniqueidentifier-in-sql-server

标签

sql-server

database-performance

uniqueidentifier