Python: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte

梦如初夏 2021-01-15 04:11

I am fetching data from a catalog and it's giving data in bytes format.

Bytes data:

b'\x80\x00\x00\x00\n\x00\x00%\x83\xa0\x08\x01\x00\


        
3 Answers
  • 2021-01-15 04:23

    For an encoding error such as

    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
    

    or a similar one, open the database dump (the file with the .json extension), change its encoding to UTF-8 (in VS Code, for example, you can change it from the status bar at the bottom right), and save the file.
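
If you prefer to convert the file from a script rather than an editor, a minimal sketch could look like the one below. It assumes the dump was saved as UTF-16, which is one common reason for a 0xff byte at position 0 (the UTF-16 byte order mark); that is a guess about the file, not something the error message proves. The function name reencode_to_utf8 and its source_encoding parameter are made up for illustration.

```python
from pathlib import Path


def reencode_to_utf8(path, source_encoding="utf-16"):
    """Rewrite a text file in place as UTF-8 and return its text.

    source_encoding is an assumption: "utf-16" fits a file that starts
    with a byte order mark (bytes 0xff 0xfe or 0xfe 0xff).
    """
    file_path = Path(path)
    # Work on raw bytes so that line endings are left untouched.
    text = file_path.read_bytes().decode(source_encoding)
    file_path.write_bytes(text.encode("utf-8"))
    return text
```

Calling reencode_to_utf8("store/dumps/store.json") would then replace the editor step. If the file is not actually UTF-16, the decode call will most likely raise an error or produce garbage, so keep a backup copy first.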

    Now run

    $ git status
    

    You will see output like this

    On branch master
    Changes not staged for commit:
      (use "git add <file>..." to update what will be committed)
      (use "git restore <file>..." to discard changes in working directory)
            modified:   store/dumps/store.json

    Untracked files:
      (use "git add <file>..." to include in what will be committed)
            .gitignore

    no changes added to commit (use "git add" and/or "git commit -a")
    

    or like this

    On branch master
    Changes to be committed:
      (use "git restore --staged <file>..." to unstage)
            modified:   store/dumps/store.json
    Untracked files:
      (use "git add <file>..." to include in what will be committed)
            .gitignore
    

    In the first case, stage the modified file with

    $ git add store/dumps/
    

    In the second case the file is already staged, so you can skip this step.

    In both cases, commit the changes with

    $ git commit -m "launching to production"
    

    The console will print a message summarizing the additions and changes.

    Then rebuild the app by pushing again with

    $ git push heroku master
    

    (for Heroku users)

    After the build, load the database again with

    $ heroku run python manage.py loaddata store/dumps/store.json
    

    This installs the objects from the fixture.

    Sorry for my English!

  • 2021-01-15 04:30

    You can try ignoring the bytes that cannot be decoded.

    blobs.decode('utf-8', 'ignore')

    It's not a great solution, because it silently drops every byte that cannot be decoded. The way you're generating the bytes object probably has some issues, or UTF-8 may not be the proper encoding for your data.
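
To make the trade-off concrete, here is a small sketch comparing three of the standard error handlers of bytes.decode. It uses only the visible prefix of the bytes posted in the question (the original is truncated), and the variable names are made up for illustration.

```python
# Visible prefix of the bytes from the question; the rest was cut off.
sample = b"\x80\x00\x00\x00\n\x00\x00%\x83\xa0\x08\x01\x00"

# "ignore" drops undecodable bytes, so information is lost silently.
ignored = sample.decode("utf-8", errors="ignore")

# "replace" substitutes U+FFFD, so the damage stays visible.
replaced = sample.decode("utf-8", errors="replace")

# "backslashreplace" keeps the original byte values as escape text.
escaped = sample.decode("utf-8", errors="backslashreplace")

print(repr(ignored))
print(repr(replaced))
print(repr(escaped))
```

All three only hide the symptom. If most of the result looks like control characters rather than text, as it does here, the data is probably not text in any encoding.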

  • 2021-01-15 04:39

    The UTF-8 encoding has some built-in redundancy that serves at least two purposes:

    1) locating code point boundaries when reading forwards or backwards

    Start bytes (shown in binary, with dots standing for the bits that carry actual data) match one of these 4 patterns

    0.......
    110.....
    1110....
    11110...
    

    whereas continuation bytes (0 to 3 of them follow each start byte) always have this form

    10......
    

    2) checking for validity

    If data does not follow these rules, it is safe to say that it is not valid UTF-8, for example because it was corrupted during a transfer.
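
The bit patterns above can be checked with a few masks. The following is a rough sketch rather than a full validator: the function name classify_utf8_byte is made up, and it looks only at the leading bits, so it does not reject bytes such as 0xc0 or 0xf5 that match a start pattern but never occur in valid UTF-8.

```python
def classify_utf8_byte(byte):
    """Return the role a single byte value (0-255) plays in UTF-8."""
    if byte & 0b1000_0000 == 0b0000_0000:   # 0.......
        return "start (1-byte sequence)"
    if byte & 0b1100_0000 == 0b1000_0000:   # 10......
        return "continuation"
    if byte & 0b1110_0000 == 0b1100_0000:   # 110.....
        return "start (2-byte sequence)"
    if byte & 0b1111_0000 == 0b1110_0000:   # 1110....
        return "start (3-byte sequence)"
    if byte & 0b1111_1000 == 0b1111_0000:   # 11110...
        return "start (4-byte sequence)"
    return "invalid"                        # 11111...


for value in (0x25, 0x80, 0x83, 0xE2, 0xFF):
    print(hex(value), format(value, "08b"), classify_utf8_byte(value))
```

Because every byte announces its own role, a decoder can resynchronize from any position, which is what point 1 refers to.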

    Conclusion

    Why is it possible to say that data beginning with b'\x80' cannot be UTF-8? The encoding is violated at the very first byte: 0x80 is 10000000 in binary, which is the continuation-byte pattern, so it cannot start a code point. This is exactly what your error message says:

    UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte

    And even if you skip this byte, you hit the same problem a few bytes later at b'%\x83': % is a complete one-byte character, so 0x83 (10000011, another continuation byte) has no start byte to attach to. Most likely you are either decoding the wrong data or assuming the wrong encoding.
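
To see where decoding fails without reading the bytes by hand, one option is to use the start and reason attributes of UnicodeDecodeError. This is only a sketch: the helper find_decode_errors is a made-up name, it skips one byte after each failure (a simplification), and sample is just the visible prefix of the truncated bytes from the question.

```python
def find_decode_errors(data, encoding="utf-8"):
    """Collect (position, byte value, reason) for each undecodable spot."""
    errors = []
    offset = 0
    while offset < len(data):
        try:
            data[offset:].decode(encoding)
            break  # the remainder decodes cleanly
        except UnicodeDecodeError as exc:
            # exc.start is relative to the slice, so shift it back.
            position = offset + exc.start
            errors.append((position, data[position], exc.reason))
            offset = position + 1  # skip the offending byte and retry
    return errors


# Visible prefix of the bytes from the question; the rest was cut off.
sample = b"\x80\x00\x00\x00\n\x00\x00%\x83\xa0\x08\x01\x00"

for position, value, reason in find_decode_errors(sample):
    print(position, hex(value), reason)
```

For this prefix the helper reports positions 0, 8 and 9, each as an invalid start byte, which matches the reasoning above.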
