Receiving Email Attachments in App Engine Python errors on Unicode Text File

情书的邮戳 2021-01-15 03:51

I have some code to parse an email and find the attachments then store them into the Datastore as db.BlobProperties (might change that to Blobstore later). The problem is t

2 answers
  • 2021-01-15 04:33

    Check if this code helps:

    ===========================

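        my_file = []   # names of the attachments that were stored
        my_list = []   # stringified results of store_file(), one per attachment
        if hasattr(mail_message, 'attachments'):
            for filename, filecontents in mail_message.attachments:
                # Undo only the transfer encoding (e.g. base64) so the blob
                # holds the bytes the sender actually attached.
                file_blob = filecontents.payload.decode(filecontents.encoding)
                my_file.append(filename)
                # store_file() is not defined in this snippet.
                my_list.append(str(store_file(self, filename, file_blob)))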
    
  • 2021-01-15 04:35

    Attachment payloads are instances of the EncodedPayload class. Attachments have an encoding and an optional character set. The former refers to transfer encodings such as base64; the latter to character encodings such as UTF-8 (character set's a bit of an outdated and misleading term here). The EncodedPayload.decode() method decodes both transfer encoding and text encoding, which as you've noticed is not very helpful if you just want to get the original bytes the user attached to their message.
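
The two layers described above can be seen in isolation with a minimal sketch. It uses only the standard library, not the App Engine API, and the sample text is made up for illustration. It is written for Python 3; on the Python 2.7 runtime the str/bytes distinction differs.

```python
import base64

# Bytes the user attached: a UTF-16 text file (hypothetical sample content).
original_bytes = u"caf\u00e9".encode("utf-16")

# Layer 1: transfer encoding, applied so the bytes survive mail transport.
transfer_encoded = base64.b64encode(original_bytes)

# Undoing only the transfer encoding recovers the attached bytes exactly.
recovered = base64.b64decode(transfer_encoded)

# Layer 2: character encoding, needed only to interpret the bytes as text.
text = recovered.decode("utf-16")
```

If the goal is to store the file as the user sent it, `recovered` is the value to keep; `text` is only needed when the content has to be read as a string.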

    There are a number of approaches you can take here, but what I'd recommend is duplicating EncodedPayload's logic for decoding the transfer encoding, which looks something like this:

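    # In the SDK these exception classes are defined in mail_errors.
    from google.appengine.api.mail_errors import (
        Error, PayloadEncodingError, UnknownEncodingError)

    if filecontents.encoding and filecontents.encoding.lower() != '7bit':
      try:
        # Undo the transfer encoding only (e.g. base64); leave the charset alone.
        payload = filecontents.payload.decode(filecontents.encoding)
      except LookupError:
        raise UnknownEncodingError('Unknown decoding %s.' % filecontents.encoding)
      except (Exception, Error) as e:
        raise PayloadEncodingError('Could not decode payload: %s' % e)
    else:
      # 7bit (or no declared encoding): the payload is already the raw content.
      payload = filecontents.payload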
    

    Note that if the attachment was text, you need to include the character encoding when you store it, or there'll be no way to interpret it when you send it back to the user - the original text could have been encoded using any character encoding.
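
As a rough illustration of why the character encoding has to travel with the bytes, here is a small sketch. `StoredAttachment` and `to_text` are hypothetical names invented for this example; in the question's setup the fields would be properties on the Datastore entity. It is written for Python 3.

```python
from collections import namedtuple

# Hypothetical record type: raw bytes plus the charset declared in the email.
StoredAttachment = namedtuple("StoredAttachment", ["filename", "data", "charset"])


def to_text(attachment, fallback="utf-8"):
    """Interpret the stored bytes as text using the charset saved with them."""
    return attachment.data.decode(attachment.charset or fallback)


# A Latin-1 encoded attachment (made-up sample content).
record = StoredAttachment("note.txt", u"na\u00efve".encode("latin-1"), "latin-1")

# Guessing the wrong charset silently corrupts the text.
wrong = record.data.decode("utf-8", "replace")
```

Here `to_text(record)` gives back the original string, while `wrong` contains a replacement character where the accented letter used to be.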

    Likewise, you should also save the MIME type of the attachment if you can, but this doesn't appear to be exposed anywhere in the API. You might want to consider bypassing the InboundEmailMessage class altogether and instead parsing the body of the POST request yourself with Python's email module.
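
A minimal sketch of that last suggestion, using only the standard library `email` package. The function name `extract_attachments` is made up for this example, and it assumes the raw message is available as bytes (for instance from the request body). It is written for Python 3; on Python 2.7 `email.message_from_string` would be the equivalent entry point.

```python
import email


def extract_attachments(raw_message):
    """Return (filename, mimetype, charset, raw_bytes) for each attachment.

    raw_message is the complete RFC 822 message as bytes.
    """
    message = email.message_from_bytes(raw_message)
    attachments = []
    for part in message.walk():
        filename = part.get_filename()
        if part.is_multipart() or filename is None:
            continue  # skip multipart containers and parts without a filename
        attachments.append((
            filename,
            part.get_content_type(),         # e.g. 'text/plain'
            part.get_content_charset(),      # None if no charset was declared
            part.get_payload(decode=True),   # undoes the transfer encoding only
        ))
    return attachments
```

Each tuple carries the MIME type and charset alongside the untouched bytes, so all three can be stored together and the file can later be served back exactly as it was attached.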
