What is a good definition of language codes and locale codes?

When should I use en_GB and when en-GB?

What is the difference?

Is there an ISO name for this combination of ISO 639-1 (language) and ISO 3166 (country) codes?

4 answers

盖世英雄少女心, 2021-01-31 09:27

It depends on the technology. In Java, for example, Locale.UK gives you the code en_GB (if you care enough to call toString()). This is what you would pass between modules (unless you are passing the concrete type) and what you would write into configuration files (e.g. faces-config.xml).

In .NET, on the other hand, you would certainly use en-GB.

The en-GB form is definitely more common, and in most cases it is the form you should use.

The difference is obvious: the separator :) Otherwise there is no difference in meaning, although a specific technology might impose its own constraints on the locale identifier.

To my knowledge, there is no ISO normative document that covers the combination of language and country. In software internationalization it is part of the locale model.

谎友^, 2021-01-31 09:37

A locale is a combination of a language and a region (usually a country).

The separator can be _ or -, but the recommended one is the hyphen.

You are probably looking for the BCP 47 standard, which uses language codes from ISO 639-1 and region/country codes from ISO 3166-1 alpha-2 (usually written in upper case).

You can find more information about them at http://blog.i18n.ro/simplified-locale-codes/
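
As a rough illustration of the convention described in this answer, here is a minimal Python sketch that joins a two-letter language code and a two-letter region code into a hyphenated tag with the usual casing. The function name make_tag is made up for this example, and the check only looks at the shape of the codes; it does not verify that they are actually assigned in ISO 639-1 or ISO 3166-1.

```python
import re

# Shape check only: exactly two ASCII letters. This is an assumption made for
# the sketch; real tags can also carry scripts, variants, and extensions.
_TWO_LETTERS = re.compile(r"[A-Za-z]{2}")


def make_tag(language: str, region: str) -> str:
    """Join a language and a region code as language-REGION (e.g. en-GB)."""
    if not _TWO_LETTERS.fullmatch(language):
        raise ValueError(f"not a two-letter language code: {language!r}")
    if not _TWO_LETTERS.fullmatch(region):
        raise ValueError(f"not a two-letter region code: {region!r}")
    # Conventional casing: lower-case language, upper-case region.
    return f"{language.lower()}-{region.upper()}"


print(make_tag("EN", "gb"))  # en-GB
```

The casing is only a convention, so a consumer of such tags should still compare them case-insensitively.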
                                                                        
                                                        
            
            
              
                
梦毁少年i, 2021-01-31 09:41

There are several systems for locale identifiers. Many of them look similar at first glance, but not when you go deeper:

Some examples (Serbian-Serbia with Latin Script, Japanese-Japan with radical sorting):


UTS-35, ICU, Mac OS X, Flash: sr-Latn-RS, ja-JP@collation=radical
Newer UTS-35, BCP 47 extension U: sr-Latn-RS, ja-JP-u-co-unihan
Win 2000, XP: 0x81a, 0x10411
Vista, Win 7: sr-Latn-CS, ja-JP_radical
Java: sr_CS, ja_JP
Java 7: sr_RS, ja_JP
Linux: sr_RS@latin, ja_JP.utf8
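
To make the Linux entries in the list above more concrete, here is a small Python sketch that splits a POSIX-style locale name of the form language[_territory][.codeset][@modifier] into its parts. The names PosixLocale and parse_posix_locale are invented for this example, and the regular expression is a simplification that covers the two examples above, not every name a real system may accept.

```python
import re
from typing import NamedTuple, Optional


class PosixLocale(NamedTuple):
    language: str
    territory: Optional[str]
    codeset: Optional[str]
    modifier: Optional[str]


# Simplified pattern for language[_territory][.codeset][@modifier].
_POSIX_RE = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:_(?P<territory>[A-Za-z]{2}))?"
    r"(?:\.(?P<codeset>[^@]+))?"
    r"(?:@(?P<modifier>.+))?$"
)


def parse_posix_locale(name: str) -> PosixLocale:
    """Split a POSIX-style locale name into its components."""
    match = _POSIX_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognised POSIX-style locale name: {name!r}")
    # Optional groups that did not match come back as None.
    return PosixLocale(**match.groupdict())


print(parse_posix_locale("sr_RS@latin"))
# PosixLocale(language='sr', territory='RS', codeset=None, modifier='latin')
print(parse_posix_locale("ja_JP.utf8"))
# PosixLocale(language='ja', territory='JP', codeset='utf8', modifier=None)
```

Note that the script ("latin") and the encoding ("utf8") live in different slots here than they would in a BCP 47 style tag, which is exactly why the identifiers are not interchangeable.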


Think of it like the different ways to describe colors (RGB, CMYK, HSV, Pantone, etc.).

So - vs. _ does not make sense unless you specify which environment you are using. Use - and Java will not understand it; use _ and Windows will not understand it.
ICU (and systems built on top of it) accepts both - and _, but produces the _ style.

There is no ISO standard that covers the language-country combination, but there are ISO standards that cover the individual parts (language, country, script).
The exact version of each ISO standard also depends on the system used for locale identifiers.



In general you should accept both _ and - but generate only one, as ICU does ("be liberal in what you accept and strict in what you emit").
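
A minimal Python sketch of the "accept both, emit one" idea could look like the following. The function name emit_locale_id is hypothetical, and the sketch assumes plain identifiers whose subtags are joined only by - or _; it does not try to interpret forms such as ja_JP.utf8 or ja-JP_radical, where other characters or a mixed separator carry extra meaning.

```python
def emit_locale_id(identifier: str, separator: str = "-") -> str:
    """Accept '-' or '_' between subtags, but always emit one separator."""
    if separator not in ("-", "_"):
        raise ValueError("separator must be '-' or '_'")
    # First collapse both accepted separators to '-', then emit the chosen one.
    return identifier.replace("_", "-").replace("-", separator)


print(emit_locale_id("en_GB"))            # en-GB
print(emit_locale_id("en-GB"))            # en-GB
print(emit_locale_id("sr-Latn-RS", "_"))  # sr_Latn_RS
```

Which separator you emit is then a single decision made at the boundary of your system, based on what the receiving side expects.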

If you communicate with systems using another type of locale identifier, you will have to map to/from your system. That will force you to use _ or -.
Some of the mappings will be lossy (there is no way to specify alternate calendars in Windows, Linux; or alternate sorting or scripts in Java older than 7, etc.) and round-tripping might not be possible (somewhat similar to conversions RGB-CMYK).
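
The lossy round trip can be shown with a small Python sketch that maps a Linux-style name to a BCP 47 style tag and back. Everything here is an assumption made for illustration: the function names, the tiny modifier-to-script table, and the decision to simply drop the codeset. A real mapping would need a much larger table and more cases.

```python
# Hypothetical table; only the entries needed for this demonstration.
SCRIPT_FOR_MODIFIER = {"latin": "Latn", "cyrillic": "Cyrl"}
MODIFIER_FOR_SCRIPT = {v: k for k, v in SCRIPT_FOR_MODIFIER.items()}


def posix_to_bcp47(name: str) -> str:
    """Map language[_territory][.codeset][@modifier] to language[-Script][-REGION]."""
    base, _, modifier = name.partition("@")
    base, _, _codeset = base.partition(".")  # codeset is dropped: lossy
    language, _, territory = base.partition("_")
    subtags = [language.lower()]
    script = SCRIPT_FOR_MODIFIER.get(modifier.lower())
    if script:
        subtags.append(script)  # unknown modifiers are dropped too
    if territory:
        subtags.append(territory.upper())
    return "-".join(subtags)


def bcp47_to_posix(tag: str) -> str:
    """Map language[-Script][-REGION] back; other subtags are ignored."""
    language, *rest = tag.split("-")
    script = next((s for s in rest if len(s) == 4), None)
    region = next((s for s in rest if len(s) == 2), None)
    name = language
    if region:
        name += "_" + region
    if script in MODIFIER_FOR_SCRIPT:
        name += "@" + MODIFIER_FOR_SCRIPT[script]
    return name


print(posix_to_bcp47("sr_RS@latin"))                 # sr-Latn-RS
print(bcp47_to_posix(posix_to_bcp47("sr_RS@latin")))  # sr_RS@latin
print(bcp47_to_posix(posix_to_bcp47("ja_JP.utf8")))   # ja_JP (codeset lost)
```

The first name survives the round trip, the second does not, because the encoding has no counterpart in the target scheme used by this sketch.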

Addition: things differ not only between systems, they also change over time. For instance, Java 7 added support for sr_RS and for scripts, Windows keeps adding support for more locales, and new countries get created (the Sudan split, Russia, Serbia) or disappear (East Germany, the U.S.S.R., Yugoslavia).

For internal representation you might want to choose the most powerful one, that can represent everything, and that is UTS-35 / BCP 47 (also used by CLDR and ICU).
                                                                        
                                                        
            
            
              
                
长发绾君心, 2021-01-31 09:44

For the Internet this is covered by RFC 3066 (since obsoleted by RFC 4646 and then RFC 5646, both part of BCP 47), which specifies "en-GB", not "en_GB".
                                                                        
                                                        
            
            
              
                