PHP imap_search: UTF-8 / Non-ASCII characters on Microsoft Exchange mail servers

Submitted by 半世苍凉 on 2019-12-11 03:20:02

Question


I want to fetch emails from outlook.office365.com using IMAP and PHP.

Since most emails contain non-ASCII characters like äöü, I pass UTF-8 as the charset to imap_search():

imap_search($mbox_connection, 'ALL', SE_UID, "UTF-8")

With UTF-8 and the search criterion ALL, I get all emails as expected. Next, I wanted to restrict the search, for example to unseen (unread) emails only:

imap_search($mbox_connection, 'UNSEEN', SE_UID, "UTF-8")

Unfortunately, this finds no emails at all, although unseen emails exist, and it also triggers this PHP notice:

PHP Notice:  Unknown: [BADCHARSET (US-ASCII)] The specified charset is not supported. (errflg=2) in Unknown on line 0

Based on this notice, I changed the charset from UTF-8 to US-ASCII:

imap_search($mbox_connection, 'UNSEEN', SE_UID, "US-ASCII")

Now, it returns all expected unseen (unread) emails.

The problem now is that I can't search for emails containing non-ASCII characters. For example, I have an email with this information:

  • From: Äpfel Nürnberg
  • Subject: Apfel vs. Äpfel
  • Body:
Einzahl gegen Mehrzahl.

Ein Apfel, mehrere Äpfel.

When I search for "apfel" (here in the From field), it works as expected and finds the email:

imap_search($mbox_connection, 'FROM "apfel"', SE_UID, "US-ASCII")
Trying to connect to '{outlook.office365.com:993/imap/ssl}INBOX'...
Found 1 email(s)...
+------ P A R S I N G ------+
From: =?iso-8859-1?Q?=C4pfel=20N=FCrnberg?= <=?iso-8859-1?Q?=C4pfel=20N=FCrnberg?= <aepfel@nuernberg.de>>
Subject: =?iso-8859-1?Q?Apfel_vs._=C4pfel?=

But when I search for the word with the non-ASCII character instead (in this case äpfel), it does NOT find the email:

imap_search($mbox_connection, 'FROM "äpfel"', SE_UID, "US-ASCII")

Because of this, I changed the charset back from US-ASCII to UTF-8, but that only leads to the [BADCHARSET (US-ASCII)] error message again.

My code is very simple:

$mailbox = "{outlook.office365.com:993/imap/ssl}INBOX";
$mailbox_username = "someone@outlook.com";
$mailbox_password = "*******";

echo "Trying to connect to '$mailbox'...\n";

$mbox_connection = imap_open($mailbox, $mailbox_username, $mailbox_password);

$mailsIds = imap_search($mbox_connection, 'SUBJECT "äpfel"', SE_UID, "UTF-8");

if(!$mailsIds) {
    echo "No emails found!\n";
    imap_close($mbox_connection);
    die();
}

echo "Found " . count($mailsIds) . " email(s)...\n";

foreach($mailsIds as $mailId) {
    echo "+------ P A R S I N G ------+\n";

    $headersRaw = imap_fetchheader($mbox_connection, $mailId, FT_UID);
    $header = imap_rfc822_parse_headers($headersRaw);

    echo "From: " . $header->from[0]->personal . " <" . $header->fromaddress . ">\n";
    echo "Subject: " . $header->subject . "\n";
}

I've already tried this solution, but it also returns no matching email:

$str = "äpfel";
$str = preg_replace('/\=\?ISO\-8859\-1\?Q\?/i', '', mb_encode_mimeheader($str, "ISO-8859-1", "Q"));
$mailsIds = imap_search($mbox_connection, 'SUBJECT "'.$str.'"', SE_UID, 'US-ASCII');

Any ideas how I can search for non-ASCII characters in the From, Subject and Body fields when the IMAP server does not support UTF-8 and I can NOT change the server-side configuration?

This seems to be an issue with all Microsoft Exchange servers. As far as I could find out via Google, only those servers have this issue.


Answer 1:


You probably can't.

Exchange doesn't seem to implement charset-aware searching for IMAP, and doing so is not a requirement of RFC 3501 (only US-ASCII must be supported). Most servers support UTF-8, but this does not seem to be the case for Exchange.
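Since US-ASCII is the only charset a server is required to accept, one way to avoid the BADCHARSET notice is to send a server-side search only when the criteria are pure ASCII. The following is a minimal sketch of that idea, not a tested fix for Exchange; the function names is_ascii_only() and search_if_ascii() are hypothetical helpers introduced here for illustration.

```php
<?php
// Hypothetical helper: true when the text contains only 7-bit ASCII bytes.
function is_ascii_only(string $text): bool
{
    // preg_match returns 0 when no byte outside 0x00-0x7F is found.
    return preg_match('/[^\x00-\x7F]/', $text) === 0;
}

// Hypothetical wrapper: run the search on the server only for ASCII criteria.
// Returns null for non-ASCII criteria to signal that the server cannot be
// trusted with this search and the caller has to handle it another way.
function search_if_ascii($connection, string $criteria)
{
    if (!is_ascii_only($criteria)) {
        return null;
    }
    // US-ASCII is the one charset RFC 3501 requires every server to support.
    return imap_search($connection, $criteria, SE_UID, 'US-ASCII');
}
```

With this, a criterion such as 'UNSEEN' or 'FROM "apfel"' goes to the server as before, while 'SUBJECT "äpfel"' is rejected up front instead of producing a misleading empty result.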

You would have to switch protocols (EAS, EWS, REST services, etc.) or pull down the information, decode it yourself, and search it. If you cache it, this isn't even too bad long term. Since it's headers, you can get this all in one fetch. If you need to search bodies, the case is much harder.
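The "pull down the headers, decode them yourself, and search them" approach could look roughly like the sketch below. It assumes the iconv extension is available for decoding MIME encoded-words, that the search term is a UTF-8 string, and that From and Subject are the only fields of interest. The function names decode_header_utf8(), header_contains() and search_headers_locally() are hypothetical and not part of the IMAP extension.

```php
<?php
// Decode a raw header value (possibly containing =?charset?Q?...?= words) to UTF-8.
function decode_header_utf8(string $raw): string
{
    $decoded = iconv_mime_decode($raw, ICONV_MIME_DECODE_CONTINUE_ON_ERROR, 'UTF-8');
    // Fall back to the raw value if decoding fails entirely.
    return $decoded === false ? $raw : $decoded;
}

// Case-insensitive, Unicode-aware substring test against a raw header value.
function header_contains(string $raw, string $needleUtf8): bool
{
    $pattern = '/' . preg_quote($needleUtf8, '/') . '/iu';
    return preg_match($pattern, decode_header_utf8($raw)) === 1;
}

// Fetch the overview of every message in one call and filter on the client.
// Returns the UIDs of messages whose From or Subject contains the needle.
function search_headers_locally($connection, string $needleUtf8): array
{
    $overviews = imap_fetch_overview($connection, '1:*');
    if ($overviews === false) {
        return [];
    }
    $uids = [];
    foreach ($overviews as $overview) {
        // from and subject may be missing on some messages.
        $from = $overview->from ?? '';
        $subject = $overview->subject ?? '';
        if (header_contains($from, $needleUtf8) || header_contains($subject, $needleUtf8)) {
            $uids[] = $overview->uid;
        }
    }
    return $uids;
}
```

Called as search_headers_locally($mbox_connection, "äpfel"), this would return UIDs that can be passed to imap_fetchheader() with FT_UID. For large mailboxes, caching the decoded headers as suggested above avoids refetching everything on each search.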



Source: https://stackoverflow.com/questions/55977063/php-imap-search-utf-8-non-ascii-characters-on-microsoft-exchange-mail-servers
