UnicodeDecodeError with Django's request.FILES

后端未结

关注

 4  1537

I have the following code in the view call..

def view(request):
    body = u\"\"  
    for filename, f in request.FILES.items():
        body = body + \'Filename


                      
              相关标签:


      
      
        
          4条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  执念已碎        
                
              
                            
                2021-02-09 05:01
              
            
            
                                                                       
If you are not in control of the file encoding for files that can be uploaded , you can guess what encoding a file is in using the Universal Encoding Detector module chardet.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  忘掉有多难        
                
              
                            
                2021-02-09 05:05
              
            
            
                                                                       
Django has some utilities that handle this (smart_unicode, force_unicode, smart_str).  Generally you just need smart_unicode.

from django.utils.encoding import smart_unicode
def view(request):
    body = u""  
    for filename, f in request.FILES.items():
        body = body + 'Filename: ' + filename + '\n' + smart_unicode(f.read()) + '\n'

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  野的像风        
                
              
                            
                2021-02-09 05:09
              
            
            
                                                                       
you are appending f.read() directly to unicode string, without decoding it, if the data you are reading from file is utf-8 encoded use utf-8, else use whatever encoding it is in.

decode it first and then append to body e.g.

data = f.read().decode("utf-8")
body = body + 'Filename: ' + filename + '\n' + data + '\n'

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  Happy的楠姐        
                
              
                            
                2021-02-09 05:21
              
            
            
                                                                       
Anurag's answer is correct. However another problem here is you can't for certain know the encoding of the files that users upload. It may be useful to loop over a tuple of the most common ones till you get the correct one:

encodings = ('windows-xxx', 'iso-yyy', 'utf-8',)
for e in encodings:
    try:
        data = f.read().decode(e)
        break
    except UnicodeDecodeError:
        pass

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复