Python: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte


梦如初夏 2021-01-15 04:11

I am fetching data from a catalog and it's giving me the data in bytes format.

Bytes data:

b'\x80\x00\x00\x00\n\x00\x00%\x83\xa0\x08\x01\x00\


      
      
        
3 answers

再見小時候 (original poster) 2021-01-15 04:39
              

            
            
                        
The UTF-8 encoding has some built-in redundancy that serves at least two purposes:

1) locating code point boundaries when reading forwards or backwards

Start bytes match one of these four binary patterns (dots stand for the bits that carry actual data)

0.......
110.....
1110....
11110...


whereas continuation bytes (0 to 3 of them follow each start byte) always have this form

10......
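
As a rough illustration of the patterns above, the leading bits of a byte can be tested with bit masks. This is only a sketch: `classify_utf8_byte` is a name made up for this example, and it checks the bit patterns only (a real decoder also rejects some bytes that match a start pattern, such as 0xC0).

```python
def classify_utf8_byte(byte: int) -> str:
    """Classify one byte by its leading bits (pattern check only)."""
    if byte & 0b1000_0000 == 0b0000_0000:
        return "start (1-byte character)"   # 0.......
    if byte & 0b1100_0000 == 0b1000_0000:
        return "continuation"               # 10......
    if byte & 0b1110_0000 == 0b1100_0000:
        return "start (2-byte character)"   # 110.....
    if byte & 0b1111_0000 == 0b1110_0000:
        return "start (3-byte character)"   # 1110....
    if byte & 0b1111_1000 == 0b1111_0000:
        return "start (4-byte character)"   # 11110...
    return "invalid"                        # 11111... never occurs in UTF-8


# Iterating over a bytes object yields integers.
for b in b"\x80\x00%\x83":
    print(f"0x{b:02x}  {b:08b}  {classify_utf8_byte(b)}")
```

Both 0x80 and 0x83 come out as continuation bytes, which is why neither can begin a character.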


2) checking for validity

If the data does not follow these patterns, it is safe to say that it is not valid UTF-8, e.g. because it was corrupted during a transfer.

Conclusion

Why is it possible to say that b'\x80' cannot be UTF-8?
The encoding is violated already at the first byte: 0x80 is 10000000 in binary, which matches the continuation pattern, so it cannot start a character. This is exactly what your error message says:


  UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte


And even if you skip this one, you hit the same problem a few bytes later at b'%\x83', where 0x83 is again a continuation byte with no start byte before it. So most likely you are either decoding the wrong data or assuming the wrong encoding.
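
To see both failures for yourself, you can catch the exception and inspect it. The sketch below assumes only the visible prefix of the bytes from the question (the rest is cut off there), and `first_decode_error` is a helper invented for this example, not part of any library.

```python
from typing import Optional, Tuple


def first_decode_error(data: bytes, encoding: str = "utf-8") -> Optional[Tuple[int, str]]:
    """Return (position, reason) of the first decode error, or None if data decodes."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError as exc:
        return exc.start, exc.reason
    return None


# Visible prefix of the bytes in the question; the original is truncated.
data = b"\x80\x00\x00\x00\n\x00\x00%\x83\xa0\x08\x01\x00"

print(first_decode_error(data))      # failure at the very first byte
print(first_decode_error(data[1:]))  # skipping it only moves the failure to 0x83
```

If every attempt fails like this, the bytes are probably not text at all, and the next step is to find out what format the catalog actually returns.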
    
             
                                                        
            
            
              
                