Postgres upper function on Turkish characters does not return the expected result

醉话见心 2021-02-04 15:08

It looks like the PostgreSQL upper/lower functions do not handle certain characters in the Turkish character set.

select upper('Aaı'), lower('Aaİ') from

3 Answers

梦如初夏 (original poster) 2021-02-04 15:24

This is indeed a bug in PostgreSQL (still not fixed, even in the current git tree).
Proof: https://github.com/postgres/postgres/blob/master/src/port/pgstrcasecmp.c

The PostgreSQL developers even mention Turkish characters specifically there:


  SQL99 specifies Unicode-aware case normalization, which we don't yet
  have the infrastructure for. Instead we use tolower() to provide a
  locale-aware translation.
  However, there are some locales where this is not right either (eg, Turkish may do strange things with 'i' and 'I').
  Our current compromise is to use tolower() for characters with
  the high bit set, and use an ASCII-only downcasing for 7-bit
  characters.


The pg_toupper() implemented in this file is extremely simplistic (as is its companion, pg_tolower()):

unsigned char
pg_toupper(unsigned char ch)
{
    if (ch >= 'a' && ch <= 'z')
        ch += 'A' - 'a';
    else if (IS_HIGHBIT_SET(ch) && islower(ch))
        ch = toupper(ch);
    return ch;
}


As you can see, this code does not treat its parameter as a Unicode code point, so it cannot work 100% correctly unless the currently selected locale happens to be the one we care about (such as a non-Unicode Turkish locale) and the OS-provided non-Unicode toupper() works correctly.
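
To make the byte-versus-code-point problem concrete, here is a minimal sketch, not PostgreSQL code. It assumes the string is UTF-8 encoded and that the program runs in the default "C" locale. The names HIGHBIT_SET, bytewise_toupper and bytewise_upper_string are hypothetical, made up for this illustration; only the branching logic mirrors the pg_toupper() quoted above.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for PostgreSQL's IS_HIGHBIT_SET macro. */
#define HIGHBIT_SET(ch) ((unsigned char) (ch) & 0x80)

/* Same branching as the quoted pg_toupper(): one byte in, one byte out. */
unsigned char
bytewise_toupper(unsigned char ch)
{
    if (ch >= 'a' && ch <= 'z')
        ch += 'A' - 'a';                    /* plain ASCII letters */
    else if (HIGHBIT_SET(ch) && islower(ch))
        ch = (unsigned char) toupper(ch);   /* locale-dependent, single byte */
    return ch;
}

/* Hypothetical helper: upper-case a NUL-terminated buffer in place,
 * one byte at a time, with no knowledge of multi-byte characters. */
void
bytewise_upper_string(char *s)
{
    for (size_t i = 0; s[i] != '\0'; i++)
        s[i] = (char) bytewise_toupper((unsigned char) s[i]);
}
```

In UTF-8, 'ı' (U+0131) is the two bytes 0xC4 0xB1. Calling bytewise_upper_string() on the bytes of 'Aaı' upper-cases the ASCII 'a' but leaves those two bytes untouched, so the result is 'AAı' rather than 'AAI'. A routine that sees one byte at a time cannot recognize a two-byte character.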

This is really sad. I just hope that this will be solved in upcoming PostgreSQL releases...