UnicodeDecodeError with Django's request.FILES

轻奢々 2021-02-09 04:41

I have the following code in my view:

def view(request):
    body = u""
    for filename, f in request.FILES.items():
        body = body + 'Filename


        
4 Answers
  • 2021-02-09 05:01

    If you do not control the encoding of the files that can be uploaded, you can guess a file's encoding with chardet, the Universal Encoding Detector module.
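
As a rough sketch of that approach in Python 3: `chardet.detect()` takes the raw bytes and returns a dict with a guessed `encoding` and a `confidence` score. The helper name `decode_with_guess`, the `utf-8` fallback and the 0.5 confidence threshold are assumptions made for illustration. The result is only a guess and can be wrong, particularly on short inputs.

```python
def decode_with_guess(raw, fallback="utf-8", min_confidence=0.5):
    """Decode bytes using chardet's guess, falling back if the guess is weak."""
    try:
        import chardet  # third-party package: pip install chardet
    except ImportError:
        chardet = None  # no detector available; only the fallback is used

    guess = chardet.detect(raw) if chardet else {}
    encoding = guess.get("encoding")
    confidence = guess.get("confidence") or 0.0
    if not encoding or confidence < min_confidence:
        encoding = fallback
    try:
        return raw.decode(encoding)
    except (UnicodeDecodeError, LookupError):
        # Wrong or unknown guess: decode with the fallback and mark bad bytes.
        return raw.decode(fallback, errors="replace")
```

In the view this would be called as `decode_with_guess(f.read())` before appending the result to `body`.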

  • 2021-02-09 05:05

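    Django has some utilities that handle this (smart_unicode, force_unicode, smart_str). Generally you just need smart_unicode. (These are the Python 2 era names; in current Django releases on Python 3 the text-returning helpers are smart_str and force_str.)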

    from django.utils.encoding import smart_unicode
    def view(request):
        body = u""  
        for filename, f in request.FILES.items():
            body = body + 'Filename: ' + filename + '\n' + smart_unicode(f.read()) + '\n'
    
  • 2021-02-09 05:09

    You are appending f.read() directly to a unicode string without decoding it. If the data you are reading from the file is UTF-8 encoded, decode it as utf-8; otherwise use whatever encoding it is actually in.

    Decode it first and then append it to body, e.g.:

    data = f.read().decode("utf-8")
    body = body + 'Filename: ' + filename + '\n' + data + '\n'
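
A self-contained Python 3 sketch of the same idea, with `io.BytesIO` objects standing in for Django's uploaded files so it can run outside a view. The function name `build_body` and the choice of `errors="replace"` (which substitutes undecodable bytes instead of raising) are assumptions, not part of the original code.

```python
import io


def build_body(files, encoding="utf-8"):
    """Concatenate uploaded file contents into one text string.

    `files` maps a name to a file-like object opened in binary mode,
    similar to what iterating over request.FILES.items() yields.
    """
    parts = []
    for filename, f in files.items():
        # Decode the bytes to text before joining them with other text.
        data = f.read().decode(encoding, errors="replace")
        parts.append("Filename: " + filename + "\n" + data + "\n")
    return "".join(parts)


uploads = {"notes.txt": io.BytesIO("café".encode("utf-8"))}
print(build_body(uploads))
```

Collecting the pieces in a list and joining once avoids rebuilding the string on every iteration, which matters only when many files are uploaded.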
    
  • 2021-02-09 05:21

    Anurag's answer is correct. However, another problem here is that you can't know for certain the encoding of the files that users upload. It may be useful to loop over a tuple of the most common encodings until one decodes without an error:

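    # 'windows-xxx' and 'iso-yyy' are placeholders for real codec names
    encodings = ('windows-xxx', 'iso-yyy', 'utf-8',)
    raw = f.read()  # read once; a second read() would return an empty string
    for e in encodings:
        try:
            data = raw.decode(e)
            break
        except UnicodeDecodeError:
            pass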
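
A runnable Python 3 sketch of this fallback loop follows. The candidate list (`utf-8`, `cp1252`, `latin-1`) and the function name `decode_first_match` are assumptions chosen for illustration. Note that a successful decode does not prove the encoding is the right one: `latin-1` accepts every byte sequence, so it acts as a catch-all and should come last, while strict codecs such as `utf-8` should come first.

```python
def decode_first_match(raw, encodings=("utf-8", "cp1252", "latin-1")):
    """Return (text, encoding) for the first encoding that decodes `raw`."""
    for name in encodings:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue  # try the next candidate
    raise ValueError("none of the candidate encodings could decode the data")


text, used = decode_first_match(b"caf\xe9")
print(used, text)
```

With the default list the `ValueError` can never be raised, because of the `latin-1` catch-all; it only matters when a caller passes a stricter list.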
    