Python: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte


I am fetching data from a catalog, and it comes back as bytes.

Bytes data:

b'\x80\x00\x00\x00\n\x00\x00%\x83\xa0\x08\x01\x00\


                      
3 answers

一向, 2021-01-15 04:23
              
            
            
                                                                       
For this encoding error
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

or others like it, open the database dump file (the one with the .json extension), change its encoding to UTF-8 (in VS Code, for example, you can do this from the encoding indicator at the right of the bottom status bar), and save the file.
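If you would rather not do this by hand in an editor, the same re-encoding can be sketched in Python. A byte 0xff at position 0 is often the start of a UTF-16 byte-order mark (FF FE), so this sketch picks the source encoding from the BOM. The helper name reencode_to_utf8 and the BOM table are illustrative assumptions, not part of the original answer, and a file without a BOM would need its encoding passed in explicitly.

```python
from pathlib import Path

# Byte-order marks for a few common encodings (illustrative, not exhaustive;
# UTF-32 is deliberately left out of this sketch).
BOMS = [
    (b"\xef\xbb\xbf", "utf-8-sig"),
    (b"\xff\xfe", "utf-16"),
    (b"\xfe\xff", "utf-16"),
]


def reencode_to_utf8(path, fallback="utf-8"):
    """Rewrite the file at `path` as UTF-8 without a BOM (hypothetical helper).

    Returns the name of the codec that was used to read the file.
    """
    raw = Path(path).read_bytes()
    encoding = fallback
    for bom, name in BOMS:
        if raw.startswith(bom):
            encoding = name
            break
    # The "utf-16" and "utf-8-sig" codecs consume the BOM while decoding.
    text = raw.decode(encoding)
    Path(path).write_bytes(text.encode("utf-8"))
    return encoding
```

For the file in this answer, the call would be reencode_to_utf8("store/dumps/store.json"). Keep a backup first, since the file is overwritten in place.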
Now run
 $ git status

You'll get output like this:
 On branch master
 Changes not staged for commit:
   (use "git add <file>..." to update what will be committed)
   (use "git restore <file>..." to discard changes in working directory)
         modified:   store/dumps/store.json

 Untracked files:
   (use "git add <file>..." to include in what will be committed)
         .gitignore

 no changes added to commit (use "git add" and/or "git commit -a")

or like this:
On branch master
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        modified:   store/dumps/store.json
Untracked files:
  (use "git add <file>..." to include in what will be committed)
        .gitignore

In the first case, stage the change with
$ git add store/dumps/

In the second case the file is already staged, so you can skip that step.
In both cases, commit the changes with
$ git commit -m "launching to production"

The console will print a message summarizing the additions and changes.
Then rebuild the app with
$ git push heroku master

(for Heroku users)
After the build, load the database again with
$ heroku run python manage.py loaddata store/dumps/store.json

This installs the objects from the dump.
Apologies for my English!
                                                                        
                                                        
            
            
              
                
一向, 2021-01-15 04:30
              
            
            
                                                                       
You can try ignoring the bytes that cannot be decoded:

blobs.decode('utf-8', 'ignore')

It's not a great solution, though: it silently discards data, and the error suggests a problem with how the bytes object is being generated. UTF-8 may simply not be the right encoding for your data.
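To see what the error handlers actually do, here is a small sketch. The sample bytes are made up for illustration and are not the asker's data.

```python
data = b"\x80abc\xffdef"  # made-up sample with two undecodable bytes

# 'ignore' silently drops every byte that cannot be decoded.
ignored = data.decode("utf-8", errors="ignore")

# 'replace' substitutes U+FFFD for each undecodable byte, so the loss stays visible.
replaced = data.decode("utf-8", errors="replace")

# 'backslashreplace' keeps the raw byte values as escape sequences.
escaped = data.decode("utf-8", errors="backslashreplace")

print(repr(ignored))
print(repr(replaced))
print(repr(escaped))
```

If the result is mostly replacement characters, that is a strong hint the data is binary or in another encoding rather than slightly damaged UTF-8.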
                                                                        
                                                        
            
            
              
                
再見小時候, 2021-01-15 04:39
              
            
            
                                                                       
The UTF-8 encoding has some built-in redundancy that serves at least two purposes:

1) locating code point boundaries when reading forwards or backwards

Start bytes match one of these four bit patterns (dots mark the bits that carry actual data):

0.......
110.....
1110....
11110...


whereas continuation bytes (zero to three of them follow each start byte) always have this form:

10......


2) checking for validity

If data does not follow these patterns, it is safe to say that it is not valid UTF-8, for example because it was corrupted in transit.
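The bit patterns above can be checked with a few shifts. This is only a sketch of the pattern test: the function name classify_utf8_byte is made up for illustration, and a real decoder applies further rules (for instance, 0xC0 and 0xC1 match the two-byte pattern but never occur in valid UTF-8).

```python
def classify_utf8_byte(byte):
    """Classify one byte value (0-255) by its leading bits only."""
    if byte >> 7 == 0b0:
        return "start, 1-byte sequence (ASCII)"
    if byte >> 6 == 0b10:
        return "continuation"
    if byte >> 5 == 0b110:
        return "start, 2-byte sequence"
    if byte >> 4 == 0b1110:
        return "start, 3-byte sequence"
    if byte >> 3 == 0b11110:
        return "start, 4-byte sequence"
    return "invalid"  # 11111xxx matches none of the patterns


for value in (0x25, 0x80, 0x83, 0xE2, 0xFF):
    print(f"{value:#04x} {value:08b} {classify_utf8_byte(value)}")
```

Because a continuation byte can always be told apart from a start byte, a reader that lands in the middle of a stream can step backwards or forwards to the nearest start byte.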

Conclusion

Why is it possible to say that b'\x80' cannot be UTF-8?
The very first byte already violates the encoding: 0x80 is 10000000 in binary, which is the continuation-byte pattern, so it cannot start a code point. This is exactly what your error message says:


  UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte


And even if you skip this byte, you hit another problem a few bytes later at b'%\x83': % is a complete one-byte character, so the continuation byte \x83 has no start byte before it. Most likely you are either decoding the wrong data or assuming the wrong encoding.
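You can let Python point out every offending position instead of counting bytes by hand. The sketch below uses the start and reason attributes of UnicodeDecodeError. The helper name find_invalid_utf8 is an assumption for illustration, and the sample is only the visible part of the bytes from the question, which are cut off there.

```python
def find_invalid_utf8(data):
    """Return (offset, byte value, reason) for each spot where decoding fails."""
    problems = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break  # the rest of the data decodes cleanly
        except UnicodeDecodeError as exc:
            bad = pos + exc.start  # exc.start is relative to the slice
            problems.append((bad, data[bad], exc.reason))
            pos = bad + 1  # skip the offending byte and try again
    return problems


# Visible prefix of the bytes shown in the question.
sample = b"\x80\x00\x00\x00\n\x00\x00%\x83\xa0\x08\x01\x00"

for offset, value, reason in find_invalid_utf8(sample):
    print(f"offset {offset}: byte {value:#04x} ({reason})")
```

On this sample it reports offset 0 (0x80) and then offsets 8 and 9 (0x83 and 0xa0), which matches the reasoning above. Several failures this close together usually mean the data is not text at all.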
                                                                        
                                                        
            
            
              
                