Search utf-8 string with Gmail X-GM-RAW IMAP command

Submitted by 限于喜欢 on 2020-01-02 15:14:00

Question


Gmail's IMAP extension command X-GM-RAW lets me perform a search as long as the query string is ASCII. If the query contains UTF-8 characters, the server returns a BAD response.

https://developers.google.com/google-apps/gmail/imap_extensions#extension_of_the_search_command_x-gm-raw

How should the UTF-8 input string be encoded so that the X-GM-RAW search works? I do not want to lose the flexibility to search specific fields such as "subject" or "rfc822msgid".

Thanks


Answer 1:


Specify CHARSET UTF-8 and send the UTF-8 search term in a literal. For example, to search for 你好, which is 6 bytes long when encoded in UTF-8:

a SEARCH CHARSET UTF-8 X-GM-RAW {6}
+ go ahead
你好
* SEARCH 15
a OK SEARCH completed (Success)

In this example you would actually send the 6-byte UTF-8 encoding of 你好 on the third line.

This will work for any SEARCH keyword that accepts an astring, including SUBJECT and HEADER MESSAGE-ID.
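As a rough illustration of the byte counting involved, here is a minimal Python sketch that builds the two pieces a client would send. The function name `build_xgmraw_search` and the tag `a` are made up for this example; it only constructs the bytes and does not talk to a server.

```python
from typing import Tuple


def build_xgmraw_search(tag: str, query: str) -> Tuple[bytes, bytes]:
    """Return (command_line, literal_line) for a UTF-8 X-GM-RAW search.

    Hypothetical helper for illustration only.
    """
    # The literal length is counted in bytes of the UTF-8 encoding,
    # not in characters.
    literal = query.encode("utf-8")
    command = f"{tag} SEARCH CHARSET UTF-8 X-GM-RAW {{{len(literal)}}}\r\n"
    # The client sends `command`, waits for the "+" continuation
    # response, and only then sends the literal followed by CRLF.
    return command.encode("ascii"), literal + b"\r\n"


command, literal = build_xgmraw_search("a", "你好")
print(command)  # b'a SEARCH CHARSET UTF-8 X-GM-RAW {6}\r\n'
```

The same approach should carry over to queries such as `subject:你好`, since the field prefix is simply part of the literal's bytes.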




Answer 2:


IMAP isn't 8-bit clean, so it has to use a variety of different encodings to represent any 8-bit data.

For things like folders and labels, IMAP4 uses Modified UTF-7 to represent these characters. Conveniently, printable ASCII data encoded in Modified UTF-7 encodes as itself (apart from "&", which becomes "&-"), so normally nothing special needs to be done.
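To make the folder-name encoding concrete, here is a small sketch of a Modified UTF-7 encoder written from the rules in RFC 3501 (printable ASCII passes through, "&" becomes "&-", everything else is UTF-16BE in a modified base64 wrapped in "&...-"). The function name `encode_modified_utf7` is invented for this example and is not the IMAPClient implementation.

```python
import base64


def encode_modified_utf7(text: str) -> str:
    """Encode a mailbox name as IMAP Modified UTF-7 (illustrative sketch)."""
    out = []
    pending = []  # run of characters that must be base64-encoded

    def flush() -> None:
        if pending:
            raw = "".join(pending).encode("utf-16-be")
            b64 = base64.b64encode(raw).decode("ascii")
            # Modified base64: no "=" padding, and "," instead of "/".
            out.append("&" + b64.rstrip("=").replace("/", ",") + "-")
            pending.clear()

    for ch in text:
        if 0x20 <= ord(ch) <= 0x7E:
            flush()
            out.append("&-" if ch == "&" else ch)
        else:
            pending.append(ch)
    flush()
    return "".join(out)


print(encode_modified_utf7("INBOX"))     # INBOX
print(encode_modified_utf7("Entwürfe"))  # Entw&APw-rfe
```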

For message headers (including subjects), the text is encoded as MIME encoded-words (RFC 2047).
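For a quick look at what such a header value looks like, the standard-library `email.header` module can produce and decode encoded-words. This is only a sketch of the header format; it says nothing about how a server matches search terms.

```python
from email.header import Header, decode_header

# Encode a non-ASCII subject as an RFC 2047 encoded-word.
encoded = Header("你好", "utf-8").encode()
print(encoded)

# Decode it back: decode_header yields (bytes, charset) pairs.
raw, charset = decode_header(encoded)[0]
decoded = raw.decode(charset)
print(decoded)
```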

And finally, attachments are generally encoded as either Base64 or Quoted-Printable.

My best guess is that Gmail uses Modified UTF-7 for its X-GM-RAW queries. The best reference implementation of Modified UTF-7 I've found is in the IMAPClient Python library.

Hope this helps!



Source: https://stackoverflow.com/questions/11517375/search-utf-8-string-with-gmail-x-gm-raw-imap-command
