Unique key with NULLs

前端未结

关注

 10  784

This question requires some hypothetical background. Let\'s consider an employee table that has columns name, date_of_birth, tit


                      
              相关标签:


      
      
        
          10条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  借酒劲吻你        
                
              
                            
                2020-12-03 01:05
              
            
            
                                                                       
There is a another way to do it. Adding a column(non-nullable) to represent the String value of date_of_birth column. The new column value would be ""(empty string) if date_of_birth is null.

We name the column as date_of_birth_str and create a unique constraint employee(name, date_of_birth_str). So when two recoreds come with the same name and null date_of_birth value, the unique constraint still works.

But the efforts of maintenance for the two same-meaning columns, and, the performance harm of new column, should be considered carefully.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  迷失自我        
                
              
                            
                2020-12-03 01:06
              
            
            
                                                                       
You can add a generated column where the NULL value is replaced by an unused constant, e.g. zero. Then you can apply the unique constraint to this column:

CREATE TABLE employee ( 
  name VARCHAR(50) NOT NULL, 
  date_of_birth DATE, 
  uq_date_of_birth DATE AS (IFNULL(date_of_birth, '0000-00-00')) UNIQUE
);

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  遥遥无期        
                
              
                            
                2020-12-03 01:07
              
            
            
                                                                       
Your problem of not having duplicates based on name is not solvable because you do not have a natural key. Putting a fake date in for people whose date of birth is unknown will not solve your problem. John Smith born 1900/01/01 is still going to be a differnt person than John Smithh born 1960/03/09.

I work with name data from large and small organizations every day and I can assure you they have two different people with the same name all the time. Sometimes with the same job title. Birthdate is no guarantee of uniqueness either, plenty of John Smiths born on the same date. Heck when we work with physicians office data we have often have two doctors with the same name, address and phone number (father and son combinations)

Your best bet is to have an employee ID if you are inserting employee data to identify each employee uniquely.  Then check for the uniquename in the user interface and if there are one or more matches, ask the user if he meant them  and if he says no, insert the record. Then build a deupping process to fix problems if someone gets assigned two ids by accident. 
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  感情败类        
                
              
                            
                2020-12-03 01:08
              
            
            
                                                                       
I recommend to create additional table column checksum which will contain md5 hash of name and date_of_birth. Drop unique key (name, date_of_birth) because it doesn't solve the problem. Create one unique key on checksum.

ALTER TABLE employee 
    ADD COLUMN checksum CHAR(32) NOT NULL;

UPDATE employee 
SET checksum = MD5(CONCAT(name, IFNULL(date_of_birth, '')));

ALTER TABLE employee 
    ADD UNIQUE (checksum);


This solution creates small technical overhead, cause for every inserted pairs you need to generate hash (same thing for every search query). For further improvements you can add trigger that will generate hash for you in every insert:

CREATE TRIGGER before_insert_employee 
BEFORE INSERT ON employee
FOR EACH ROW
    IF new.checksum IS NULL THEN
      SET new.checksum = MD5(CONCAT(new.name, IFNULL(new.date_of_birth, '')));
    END IF;

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  萌比男神i        
                
              
                            
                2020-12-03 01:13
              
            
            
                                                                       
In simple words,the role of Unique constraint is to make the field or column.
The null destroys this property as database treats null as unknown

Inorder to avoid duplicates and allow null:


  Make unique key as Primary key

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  醉梦人生        
                
              
                            
                2020-12-03 01:22
              
            
            
                                                                       
A fundamental property of a unique key is that
it must be unique. Making part of that key Nullable destroys this property.

There are two possible solutions to your problem:


One way, the wrong way, would be to use some magic date to represent unknown. This just gets you past
the DBMS "problem" but does not solve the problem in a logical sense.
Expect problems with two "John Smith" entries having unknown dates
of birth. Are these guys one and the same or are they unique individuals?
If you know they are different then you are back to the same old problem -
your Unique Key just isn't unique. Don't even think about assigning a whole range of magic dates
to represent "unknown" - this is truly the road to hell.
A better way is to create an EmployeeId attribute as a surrogate key. This is just an
arbitrary identifier that you assign to individuals that you know are unique. This
identifier is often just an integer value.
Then create an Employee table to relate the EmployeeId (unique, non-nullable
key) to what you believe are the dependant attributers, in this case
Name and Date of Birth (any of which may be nullable). Use the EmployeeId surrogate key everywhere that you
previously used the Name/Date-of-Birth. This adds a new table to your system but
solves the problem of unknown values in a robust manner.

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
   
          
     1
2
下一页
           
           
        
                                  
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复