When it comes to internationalization & Unicode, I'm an idiot American programmer. Here's the deal.
#include <string>
using namespace std;
typedef basic_string<unsigned char> ustring; // one code unit per UTF-8 byte
Using different character types for different encodings has the advantage that the compiler barks at you when you mix them up. The downside is that you have to convert manually.
A few helper functions to the rescue:
// Copy the bytes of a system-encoded string (no actual transcoding!).
inline ustring convert(const std::string& sys_enc) {
    return ustring(sys_enc.begin(), sys_enc.end());
}

// For string literals: N includes the trailing '\0', so drop it.
template <std::size_t N>
inline ustring convert(const char (&array)[N]) {
    return ustring(array, array + N - 1);
}

inline ustring convert(const char* pstr) {
    return ustring(reinterpret_cast<const ustring::value_type*>(pstr));
}
Of course, all these fail silently and fatally when the string to convert contains anything other than ASCII.
Narrow string literals are defined to be arrays of const char, and there aren't unsigned string literals[1], so you'll have to cast:
ustring s = reinterpret_cast<const unsigned char*>("Hello, UTF-8");
Of course you can put that long thing into an inline function:
inline const unsigned char* uc_str(const char* s) {
    return reinterpret_cast<const unsigned char*>(s);
}
ustring s = uc_str("Hello, UTF-8");
Or you can just use basic_string<char> and get away with it 99.9% of the time you're dealing with UTF-8.
[1] Unless char is unsigned, but whether it is or not is implementation-defined, blah, blah.