Python cannot open UTF-8 encoded text file


长发绾君心 2021-01-17 00:48

I have a .py script that contains the following code to open a specific text file (generated by Exchange PowerShell):

with codecs.open(\"C:\\\\Temp\\\\myf


      
      
        
3 answers

南笙 (original poster) 2021-01-17 01:16
              

            
            
                        
First, this text is definitely not UTF-8, so that's why Python can't open it as a UTF-8-encoded text file.

Second, you claim you "tried also utf-16-be and utf-16-le", but didn't show how you did that, and I suspect you did it wrong. 

From the output, this is very likely UTF-16-LE with a BOM.

Because of the way you've printed them, we can't tell exactly what the first two bytes are, but this is what \xFF and \xFE bytes look like when printed. The rest of the string is a run of NUL bytes alternating with reasonable-looking bytes, which almost always means UTF-16-LE. On top of that, the most common two-byte encoding with a BOM in the wild is UTF-16-LE, and the fact that you're using all Microsoft tools makes that even more likely.
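If you want to check this yourself rather than eyeball printed output, here is a rough sketch of sniffing the BOM from raw bytes. The `sniff_bom` helper and the `sample` bytes are made up for illustration (they are not from your file); the `codecs.BOM_*` constants are standard library.

```python
import codecs
from typing import Optional

# BOM signatures, longest first, so a UTF-32-LE BOM (FF FE 00 00)
# is not mistaken for a UTF-16-LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(raw: bytes) -> Optional[str]:
    """Return a codec name that consumes the BOM, or None if no BOM is found."""
    for bom, codec in _BOMS:
        if raw.startswith(bom):
            return codec
    return None


# Hypothetical sample: what a BOM-prefixed UTF-16-LE file containing "Name" looks like.
sample = codecs.BOM_UTF16_LE + "Name".encode("utf-16-le")

print(sample[:2])         # the BOM bytes, FF FE
print(sample[3::2])       # every second byte after the BOM is NUL for ASCII text
print(sniff_bom(sample))  # the codec to pass when decoding
```

This is only a heuristic: a file with no BOM returns `None`, and you would then have to fall back on the NUL-byte pattern or on knowing what produced the file.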

So, if you'd really tried utf-16-le, you would almost certainly have gotten the right string, but with an extra \ufeff at the start.

But of course the right answer is to just decode it as 'utf-16', which will consume and use the BOM properly.
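To make the difference concrete, here is a small sketch using made-up content (the `text` string and the temporary file name are placeholders, not your actual file). It uses the built-in `open` with `encoding=`, which assumes Python 3; `codecs.open` accepts the same `encoding` argument.

```python
import codecs
import os
import tempfile

# Hypothetical content standing in for the PowerShell output.
text = "Name : mailbox"
raw = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# Decoding as UTF-8 fails: 0xFF can never start a UTF-8 sequence.
try:
    raw.decode("utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# 'utf-16-le' decodes fine but keeps the BOM as a leading U+FEFF character.
as_le = raw.decode("utf-16-le")

# 'utf-16' reads the BOM to pick the byte order, then drops it.
as_auto = raw.decode("utf-16")

# The same applies when reading from a file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")  # placeholder file name
    with open(path, "wb") as f:
        f.write(raw)
    with open(path, encoding="utf-16") as f:
        from_file = f.read()

print(utf8_failed, repr(as_le), repr(as_auto), repr(from_file))
```

If you do end up with a stray `\ufeff` from an explicit `utf-16-le` decode, stripping it by hand works, but letting `'utf-16'` handle the BOM is less error-prone.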
    
             
                                                        
            
            
              
                