Enforcing “zero-or-one to one” relationship on SQL database?

一整个雨季 2021-01-06 11:53

I have a Post entity and a FbPost entity.

Post.FbPost is either null or a FbPost, and no two Post en

4 answers
  • 2021-01-06 12:28

    Making the FK column NULLable gives you the 0; a UNIQUE constraint on your FK reference will give you the 1.

  • 2021-01-06 12:33

    Normally, you would probably attempt to have the foreign key (in the source table) be nullable and place a unique constraint on it.

    The fact that it's nullable means that you can have a null entry in the source table that does not refer to an entry in the target table. And, if it's not null, the unique constraint ensures that only one row of the source table can reference a row in the target table.

    Unfortunately, at least with SQL Server (a), NULLs in uniquely constrained columns have to be unique as well, although this "breaks" the SQL guideline that NULL is not equal to any value, including another NULL. So basically, this method won't work with SQL Server.
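
    On an engine that does allow repeated NULLs under a UNIQUE constraint, the nullable-plus-unique idea can be sketched roughly as below. This is only an illustration: it uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in (SQLite treats NULLs as distinct here, unlike SQL Server), and the table and column names (post, fb_post, fb_post_id) are assumptions rather than anything from the question.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a nullable, UNIQUE foreign key."""
    con = sqlite3.connect(":memory:")
    con.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    con.executescript("""
        CREATE TABLE fb_post (fb_post_id INTEGER PRIMARY KEY);
        CREATE TABLE post (
            post_id    INTEGER PRIMARY KEY,
            -- nullable by default; UNIQUE applies to the non-NULL values
            fb_post_id INTEGER UNIQUE REFERENCES fb_post (fb_post_id)
        );
        INSERT INTO fb_post (fb_post_id) VALUES (10);
    """)
    return con


def try_insert(con, post_id, fb_post_id):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        con.execute(
            "INSERT INTO post (post_id, fb_post_id) VALUES (?, ?)",
            (post_id, fb_post_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


con = make_db()
results = [
    try_insert(con, 1, None),  # post without an FbPost: accepted
    try_insert(con, 2, None),  # a second NULL: accepted on this engine
    try_insert(con, 3, 10),    # first reference to FbPost 10: accepted
    try_insert(con, 4, 10),    # second reference to FbPost 10: rejected by UNIQUE
    try_insert(con, 5, 99),    # reference to a missing FbPost: rejected by the FK
]
print(results)
```

    The same DDL on SQL Server would reject the second NULL row, which is the limitation described above.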

    One possible way out of this quandary (b) is a foreign key constraint with a nullable column, but no unique constraint. This will allow you to ensure you either don't reference a row in the target table at all (NULL in the source table) or reference a target row (any non-NULL value in the source table).

    It doesn't, however, give you your "only one source row can reference the target row" requirement. That could be added with a before-(insert/update) trigger which would check every other row in the source table to ensure no other row already references the target row. (SQL Server itself has no BEFORE triggers, so there the check would go in an AFTER or INSTEAD OF trigger.)
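
    A rough sketch of such a trigger check follows. It is written against SQLite (via Python's sqlite3 module) because SQLite supports BEFORE triggers and runs anywhere; the trigger syntax on other products differs, and the table, column, and trigger names are made up for the example. Under concurrent writers on a server DBMS, a trigger like this would also need suitable locking or isolation, which the sketch does not attempt.

```python
import sqlite3

SCHEMA = """
CREATE TABLE fb_post (fb_post_id INTEGER PRIMARY KEY);
CREATE TABLE post (
    post_id    INTEGER PRIMARY KEY,
    fb_post_id INTEGER REFERENCES fb_post (fb_post_id)  -- nullable, no UNIQUE
);

-- Reject an insert whose target row is already referenced by another post.
CREATE TRIGGER post_fb_unique_ins BEFORE INSERT ON post
WHEN NEW.fb_post_id IS NOT NULL
BEGIN
    SELECT RAISE(ABORT, 'fb_post already referenced')
    WHERE EXISTS (SELECT 1 FROM post WHERE fb_post_id = NEW.fb_post_id);
END;

-- Same check on update, ignoring the row being updated.
CREATE TRIGGER post_fb_unique_upd BEFORE UPDATE OF fb_post_id ON post
WHEN NEW.fb_post_id IS NOT NULL
BEGIN
    SELECT RAISE(ABORT, 'fb_post already referenced')
    WHERE EXISTS (SELECT 1 FROM post
                  WHERE fb_post_id = NEW.fb_post_id
                    AND post_id <> OLD.post_id);
END;
"""


def attempt(con, sql):
    """Run one statement; True if accepted, False if the database rejects it."""
    try:
        con.execute(sql)
        return True
    except sqlite3.DatabaseError:
        return False


con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.executescript(SCHEMA)
con.execute("INSERT INTO fb_post (fb_post_id) VALUES (10), (11)")

outcomes = {
    "first_ref":    attempt(con, "INSERT INTO post VALUES (1, 10)"),
    "null_a":       attempt(con, "INSERT INTO post VALUES (2, NULL)"),
    "null_b":       attempt(con, "INSERT INTO post VALUES (3, NULL)"),
    "dup_insert":   attempt(con, "INSERT INTO post VALUES (4, 10)"),
    "dup_update":   attempt(con, "UPDATE post SET fb_post_id = 10 WHERE post_id = 2"),
    "other_target": attempt(con, "UPDATE post SET fb_post_id = 11 WHERE post_id = 2"),
    "self_update":  attempt(con, "UPDATE post SET fb_post_id = 10 WHERE post_id = 1"),
}
print(outcomes)
```

    Only the two statements that would make a second post point at FbPost 10 are rejected; any number of NULLs is accepted.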

    And you should almost always prefer constraints in the database itself. You never know when a rogue application (malignant or buggy) will connect to your database and decide not to follow the rules.


    (a) The following text, paraphrased from here, shows the differing support for nullable unique columns in several DBMS products:

    The standard:

    As the constraint name indicates, a (set of) column(s) with a UNIQUE constraint may only contain unique (combinations of) values.

    A column, or a set of columns, which is subject to a UNIQUE constraint must also be subject to a NOT NULL constraint, unless the DBMS implements an optional "NULLs allowed" feature (feature ID 591). The optional feature adds some additional characteristics to the UNIQUE constraint:

    First, columns involved in a UNIQUE constraint may also have NOT NULL constraints, but they do not have to. Secondly, if columns with UNIQUE constraints do not also have NOT NULL constraints, then the columns may contain any number of NULL 'values' (a logical consequence of the fact that NULL is not equal to NULL).

    PostgreSQL:

    Follows the standard, including the optional NULLs allowed feature.

    DB2:

    Follows the non-optional parts of the UNIQUE-constraint. Doesn't implement the optional NULLs allowed feature.

    MSSQL:

    Follows the standard with a twist.

    MSSQL offers the NULLs allowed feature, but allows at most one instance of a NULL 'value', if NULLs are allowed. In other words, it breaks characteristic 2 in the above description of the standard.

    MySQL:

    Follows the standard, including the optional NULLs allowed feature.

    Oracle:

    Follows the standard with a twist regarding multi-column UNIQUE-constraints.

    The optional NULLs allowed feature is implemented: If the UNIQUE-constraint is imposed on a single column, then the column may contain any number of NULLs (as expected from characteristic 2 in the above description of the standard). However, if the UNIQUE-constraint is specified for multiple columns, then Oracle sees the constraint as violated if any two rows

    • contain at least one NULL in a column affected by the constraint, and
    • have identical, non-NULL values in the rest of the columns affected by the constraint

    (b) The other way out, of course, is to choose a DBMS that implements this feature, like PostgreSQL or MySQL.

    That may not be possible in your specific case but it should be at least contemplated. For example, I steer clear of Oracle because of its inability to discern NULLs from empty strings in certain character columns, though others probably aren't as "purist" (my wife would say "anal retentive") as I am :-)

  • 2021-01-06 12:34

    Superclass It!

    I think the best solution is to make the Post table a superclass of the FBPost table. (Note: I'm going to use PostID and FBPostID here to make it absolutely clear what I'm referring to.) That is, remove the FBPostID column from your Post table entirely, after updating the FBPost.FBPostID column to match the corresponding PostID. Instead of each FBPostID having its own unique value, it will share the same value as the PostID. In my professional opinion, this is the right way to model a one-to-zero-or-one relationship. It has a huge advantage of being foolproof, and not requiring any additional indexes, triggers, or constraints beyond what a simple FK already provides.

    Note: I'm assuming that we can just update the FBPostID column in FBPost once we drop the (presumed) PK on it. If it is an identity column, then more work will be required: a new column has to be added to become the new PK, and the original column renamed. If column order matters, the data will have to be moved to a new table in order to make a new column appear in the desired location.

    Be sure to think about concurrency while working on this so no data can be improperly read or modified while the change is taking place or before the altered data handling is updated.

    Why you would choose this

    When you're modeling an is-a relationship instead of a has-a relationship, and the two entities participate as one-to-zero-or-one, then they should share the same surrogate key, because really they're the same entity (you're actually modeling an is-sometimes-a relationship).

    Even though changing to this database model will take a bit of work, it's totally worth it to repair a suboptimal design. Your relational database's design should always leverage the core relational functionality available. Why would you do anything else? Consider:

    • You would never write a trigger to enforce a foreign key relationship, since databases offer explicit foreign key constraints that do the job with 100% reliability.
    • You would never modify the system tables directly to create a new table, you'd issue a CREATE TABLE command.
    • And so on and so forth.

    Similarly, you would use a foreign key/primary key combination to model an is-sometimes-a relationship, without having to do weird NULL-allowing unique constraints or filtered indexes to accomplish that purpose.
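
    As a minimal sketch of the shared-key model (run against an in-memory SQLite database through Python's sqlite3 module purely so it can be tried anywhere; the lowercase table names and the title and fb_url columns are hypothetical), the subtype table's primary key doubles as the foreign key to the supertype:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
con.executescript("""
    CREATE TABLE post (
        post_id INTEGER PRIMARY KEY,
        title   TEXT NOT NULL
    );
    -- fb_post_id is both the primary key and a foreign key to post:
    -- the PK allows at most one fb_post per post, the FK requires the post to exist.
    CREATE TABLE fb_post (
        fb_post_id INTEGER PRIMARY KEY REFERENCES post (post_id),
        fb_url     TEXT
    );
    INSERT INTO post (post_id, title) VALUES (1, 'plain post'), (2, 'shared to FB');
    INSERT INTO fb_post (fb_post_id, fb_url) VALUES (2, 'https://example.invalid/2');
""")


def accepted(sql):
    """Return True if the statement succeeds, False on a constraint violation."""
    try:
        con.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A second fb_post for post 2 collides with the primary key.
second_fb_for_post_2 = accepted("INSERT INTO fb_post (fb_post_id) VALUES (2)")
# An fb_post for a post that does not exist violates the foreign key.
fb_for_missing_post = accepted("INSERT INTO fb_post (fb_post_id) VALUES (99)")
# Post 1 had no fb_post yet, so giving it one is fine.
fb_for_post_1 = accepted("INSERT INTO fb_post (fb_post_id) VALUES (1)")

print(second_fb_for_post_2, fb_for_missing_post, fb_for_post_1)
```

    No unique constraint, filtered index, or trigger is involved; the two ordinary key constraints carry the whole rule.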

    Example Change Script

    It's not even too bad! Here is some script that should get you pretty well started:

    BEGIN TRAN;
    ALTER TABLE dbo.Post DROP CONSTRAINT FK_Post_FBPostID; -- referencing FBPost
    -- Also remove all FKs from any other tables referencing FBPostID
    
    ALTER TABLE dbo.FBPost DROP CONSTRAINT PK_FBPost; -- FBPostID column not PK
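    ALTER TABLE dbo.FBPost ADD OriginalFBPostID int;
    -- In SSMS/sqlcmd, end the batch here (GO): the new column must exist
    -- before the statement below is compiled
    UPDATE dbo.FBPost WITH (HOLDLOCK, UPDLOCK) SET OriginalFBPostID = FBPostID;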
    
    UPDATE F
    SET F.FBPostID = P.PostID
    FROM
       dbo.FBPost F
       INNER JOIN dbo.Post P
          ON F.OriginalFBPostID = P.FBPostID;
    
    -- Perform a similar update on all other tables referencing FBPostID
    
    -- Now, the two most important changes
    ALTER TABLE dbo.FBPost ADD CONSTRAINT PK_FBPost PRIMARY KEY CLUSTERED (FBPostID);
    ALTER TABLE dbo.FBPost ADD CONSTRAINT FK_FBPost_PostID FOREIGN KEY (FBPostID)
       REFERENCES dbo.Post (PostID); -- This is where the magic happens!
    
    ALTER TABLE dbo.SomeTable ADD CONSTRAINT FK_SomeTable_FBPostID FOREIGN KEY (FBPostID)
       REFERENCES dbo.FBPost (FBPostID); -- and all other tables referencing FBPostID
    
    -- Either rename the old column so that nothing keeps using it by accident...
    -- EXEC sp_rename 'dbo.Post.FBPostID', 'OriginalFBPostID', 'COLUMN';
    -- ...or, even better, remove it.
    ALTER TABLE dbo.Post DROP COLUMN FBPostID;
    COMMIT TRAN;
    ALTER TABLE dbo.FBPost DROP COLUMN OriginalFBPostID; -- Meaningless now
    -- If you keep OriginalFBPostID and it is identity, please copy the values
    -- to a new non-identity column and drop it so you don't keep generating more
    

    Finally, modify your Post/FBPost insertion code to use the PostID as the FBPostID.
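
    What that insertion code might look like is sketched below, again with Python's sqlite3 module and an in-memory database as a stand-in. The create_post helper and the title and fb_url columns are invented for the example; on SQL Server the generated key would typically be read with SCOPE_IDENTITY() or an OUTPUT clause rather than lastrowid.

```python
import sqlite3


def create_post(con, title, fb_url=None):
    """Insert a post and, optionally, its fb_post row under the same key."""
    with con:  # one transaction: commit on success, roll back on error
        post_id = con.execute(
            "INSERT INTO post (title) VALUES (?)", (title,)
        ).lastrowid
        if fb_url is not None:
            # Reuse the PostID as the FBPostID instead of generating a new key.
            con.execute(
                "INSERT INTO fb_post (fb_post_id, fb_url) VALUES (?, ?)",
                (post_id, fb_url),
            )
    return post_id


con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.executescript("""
    CREATE TABLE post (post_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE fb_post (
        fb_post_id INTEGER PRIMARY KEY REFERENCES post (post_id),
        fb_url     TEXT NOT NULL
    );
""")

plain_id = create_post(con, "plain post")
shared_id = create_post(con, "shared post", "https://example.invalid/shared")

# A LEFT JOIN lists every post, with NULL where there is no fb_post.
rows = con.execute("""
    SELECT P.title, F.fb_url
    FROM post P
    LEFT JOIN fb_post F ON P.post_id = F.fb_post_id
    ORDER BY P.post_id
""").fetchall()
print(rows)
```

    Doing both inserts in one transaction means a failure on the fb_post row does not leave a post behind without the FbPost it was supposed to have.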

    New Query Appearance

    Just to drive home the point, your joins between the tables used to look like this:

    SELECT
       P.Something,
       F.SomethingElse
    FROM
       dbo.Post P
       INNER JOIN dbo.FBPost F
          ON P.FBPostID = F.FBPostID
    

    But now they will look like this:

    SELECT
       P.Something,
       F.SomethingElse
    FROM
       dbo.Post P
       INNER JOIN dbo.FBPost F
          ON P.PostID = F.FBPostID -- the important part
    

    The problem is completely solved now! Your tables even take up less space (losing the FBPostID column from Post). You don't have to monkey around with an FK that allows multiple NULLs. With a PK on FBPostID in the FBPost table it is obvious that you can only have one FBPost row per FBPostID.

  • 2021-01-06 12:50

    Create a unique filtered index (available in SQL Server 2008 and later):

    CREATE UNIQUE INDEX Post_Unq_FbPost ON dbo.Post(FbPost) WHERE FbPost IS NOT NULL;
    

    Also create a foreign key, of course.
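
    To see the index and the foreign key working together, here is a small sketch. It runs against an in-memory SQLite database through Python's sqlite3 module, since SQLite accepts the same CREATE UNIQUE INDEX ... WHERE syntax (it calls this a partial index); the FbPost table and the Id columns are assumed names, and the dbo schema prefix is dropped because SQLite has no such schema.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
con.executescript("""
    CREATE TABLE FbPost (Id INTEGER PRIMARY KEY);
    CREATE TABLE Post (
        Id     INTEGER PRIMARY KEY,
        FbPost INTEGER REFERENCES FbPost (Id)  -- nullable foreign key
    );
    -- Uniqueness is enforced only for rows that actually reference an FbPost.
    CREATE UNIQUE INDEX Post_Unq_FbPost ON Post (FbPost) WHERE FbPost IS NOT NULL;
    INSERT INTO FbPost (Id) VALUES (10);
""")


def try_insert(post_id, fb_post):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        con.execute("INSERT INTO Post (Id, FbPost) VALUES (?, ?)", (post_id, fb_post))
        return True
    except sqlite3.IntegrityError:
        return False


index_results = [
    try_insert(1, None),  # no FbPost: row is outside the filtered index
    try_insert(2, None),  # another NULL: also outside the index
    try_insert(3, 10),    # first reference to FbPost 10
    try_insert(4, 10),    # duplicate reference: rejected by the unique index
    try_insert(5, 99),    # missing FbPost: rejected by the foreign key
]
print(index_results)
```

    Because NULL rows never enter the index, this approach sidesteps the question of how a given DBMS treats NULLs under a UNIQUE constraint.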
