how to display and input chinese (and other non-ASCII) character in r console?

前端未结

关注

 1  1365

难免孤独 2021-02-06 00:45

My system: win7 ultimate 64 english version + r-3.1(64) .
Here is my sessionInfo.

> sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-w64-mingw3


      
      
        
          1条回答        

        
                    
            
            
                         
                
              
              
                
                   忘了有多久
                                             
                
                
                (楼主)
            
              
              
                2021-02-06 00:58
              

            
            
                        
It is probably not very well documented, but you want to use setlocale in order to use Chinese. And the method applies to many other languages as well. The solution is not obvious as the official document of setlocale didn't specifically mentioned it as a method to solve the display issues.

> print('ÊÔÊÔ') #试试, meaning let's give it a shot in Chinese
[1] "ÊÔÊÔ" #won't show up correctly
> Sys.getlocale()
[1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"
> Sys.setlocale(category = "LC_ALL", locale = "chs") #cht for traditional Chinese, etc.
[1] "LC_COLLATE=Chinese_People's Republic of China.936;LC_CTYPE=Chinese_People's Republic of China.936;LC_MONETARY=Chinese_People's Republic of China.936;LC_NUMERIC=C;LC_TIME=Chinese_People's Republic of China.936"
> print('试试')
[1] "试试"
> read.table("c:/CHS.txt",sep=" ") #Chinese: the 1st record/observation
  V1   V2  V3 V4  V5   V6
1 122 第一 122 条 122 记录 


If you just want to change the display encoding, without changing other aspects of locales, use LC_CTYPE instead of LC_ALL:

> Sys.setlocale(category = "LC_CTYPE", locale = "chs")
[1] "Chinese_People's Republic of China.936"
> print('试试')
[1] "试试"


Now, of course this only applies to the official R console. If you use other IDE's, such as the very popular RStudio, you don't need to do this at all to be able to type and display Chinese, even if you didn't have the Chinese locale loaded.

Migrate some useful stuff from the following comments:

If the data still fails to show up correctly, the we should also look into the issue of the file encoding. If the file is UTF-8 encoded, tither data <- read.table("you_file", sep=',', fileEncoding="UTF-8-BOM", header=TRUE) or fileEncoding="UTF-8" will do, depends on which encoding it really has. 

But you may want to stay away from UTF-BOM as it is not recommended: What's different between UTF-8 and UTF-8 without BOM?
    
             
                                                        
            
            
              
                
                0
              
                   
                
               讨论(0)
              
                                                  
              
              
                          
             
       
          
              
                                    
                         
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
                              			
        
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复