MySQL does not treat ı as i?


I have a user table in MySQL 5.7.27 with utf8mb4_unicode_ci collation.

Unfortunately, ı is not treated as i; for example, the query below won't find


                      


      
      
        
1 Answer

渐次进展
2021-01-28 03:33

Referring to http://mysql.rjweb.org/utf8_collations.html, I see that ı=i in 3 collations: utf8_general_ci, utf8_general_mysql500_ci, and utf8_turkish_ci. However, in the Turkish collation, I=ı sorts before the other accented I's. In all other collations, ı sorts after all the I's, as if it were a separate letter.

Meanwhile, İ=I in all collations except utf8_turkish_ci.

The plot thickens with MySQL 8.0. Only utf8mb4_tr_0900_ai_ci has this ordering:

I=Ì=Í=Î=Ï=Ĩ=Ī=Ĭ=Į=ı sort before  i=ì=í=î=ï=ĩ=ī=ĭ=į=İ


Meanwhile, ä=Ä, and both match most other accented A's in most collations (including the Turkish ones).

Bottom line: It seems that utf8[mb4]_general_ci is the only collation in 5.7 or 8.0 that always treats a dotless i (or dotted I) as equal to a regular i/I and, at the same time, ignores umlauts.
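
If changing the column collation is not an option, the same equivalence can be approximated on the application side. The following is only a rough Python sketch, not something taken from the linked page or from MySQL itself: the function name `search_key` is hypothetical, and it assumes you would store or compare the folded value yourself (for example in a separate search column).

```python
import unicodedata


def search_key(text: str) -> str:
    """Fold text so dotted/dotless i variants and accented letters compare equal."""
    # Decompose so accents become separate combining marks (İ -> I + U+0307).
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks (Unicode general category Mn).
    stripped = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    # ı (U+0131) has no decomposition, so map it to i explicitly, then lowercase.
    return stripped.replace("\u0131", "i").lower()
```

With this sketch, `search_key("İstanbul")`, `search_key("ıstanbul")` and `search_key("ISTANBUL")` all return `"istanbul"`. Unlike a language-aware collation, it deliberately throws away the Turkish distinction between the two letters.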

Caveat: The "general" collations do not compare more than one character at a time. That is, a vowel followed by a non-spacing (combining) umlaut will not be treated as equal to the single precomposed character.
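
To make the caveat concrete at the code-point level, here is a small Python sketch. It says nothing about MySQL's internals; it only shows that the precomposed ä and the sequence a + U+0308 (combining diaeresis) are different strings until they are normalized, so normalizing input to NFC before storing it is one possible way to avoid the problem. The helper name `to_nfc` is made up for illustration.

```python
import unicodedata

precomposed = "\u00e4"   # ä as a single code point
decomposed = "a\u0308"   # a followed by a combining diaeresis


def to_nfc(text: str) -> str:
    """Compose base letter + combining mark sequences where a precomposed form exists."""
    return unicodedata.normalize("NFC", text)


print(len(precomposed), len(decomposed))   # 1 2
print(precomposed == decomposed)           # False
print(to_nfc(decomposed) == precomposed)   # True
```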

In that link, the single character æ sorts the same as the two letters ae in some collations; this is indicated by:  Aa  ae=æ  az.  In about half of the other collations, æ is treated as a separate letter; this is indicated by its position after az and before b, or even after zz in the Scandinavian collations. This separate-letter concept sometimes applies to letter pairs as well, for example cs (Hungarian) and ch (traditional Spanish).
                                                                        
                                                        
            
            
              
                