Decode a UTF-8 email header

北海茫月 2020-12-15 04:40

I have an email subject of the form:

=?utf-8?B?T3.....?=

The body of the email is UTF-8 and base64-encoded, and it has decoded fine. I am current

5 Answers
  • 2020-12-15 05:09

    Check out RFC2047. The 'B' means that the part between the last two '?'s is base64-encoded. The 'utf-8' naturally means that the decoded data should be interpreted as UTF-8.

  • 2020-12-15 05:10

    MIME::Words from MIME-tools works well for this too. I ran into some issues with Encode and found that MIME::Words succeeded on some strings where Encode did not.

    use MIME::Words qw(:all);

    # In scalar context this returns the decoded parts joined together.
    my $decoded = decode_mimewords(
        'To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>',
    );
    
  • 2020-12-15 05:24

    This is a standard extension for charset labeling of headers, specified in RFC2047.

  • 2020-12-15 05:28

    I think that the Encode module handles that with the MIME-Header encoding, so try this:

    use Encode qw(decode);
    my $decoded = decode("MIME-Header", $encoded);
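    To make the snippet above concrete, here is a minimal sketch with a made-up header value (the payload `aMOpbGxv` is base64 for the UTF-8 bytes of "héllo"; it is an assumption for illustration, not the asker's subject). The result of `decode` is a Perl character string, so an output layer is set before printing it.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical header line; only the encoded-word part needs decoding.
my $encoded = 'Subject: =?utf-8?B?aMOpbGxv?=';

# Decode any RFC 2047 encoded-words into Perl characters.
my $decoded = decode('MIME-Header', $encoded);

# Encode characters back to UTF-8 bytes on output.
binmode(STDOUT, ':encoding(UTF-8)');
print "$decoded\n";
```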
    
  • 2020-12-15 05:35

    The encoded-word tokens (as per RFC 2047) can occur in values of some headers. They are parsed as follows:

    =?<charset>?<encoding>?<data>?=
    

    The charset is UTF-8 in this case, and the encoding is B, which means base64 (the other option is Q, which is similar to quoted-printable).

    To read it, first decode the base64, then treat it as UTF-8 characters.
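    The steps above can be sketched by hand with the core modules MIME::Base64 and Encode. This is only an illustration under simplifying assumptions: `decode_word` and `decode_header` are hypothetical helper names, the sample values are made up, and RFC 2047 details such as folded lines or a language suffix on the charset are not handled.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);
use Encode qw(decode);

# Hypothetical helper: decode the three fields of one encoded-word.
sub decode_word {
    my ($charset, $encoding, $data) = @_;
    my $bytes;
    if (uc($encoding) eq 'B') {
        # B encoding is plain base64.
        $bytes = decode_base64($data);
    }
    else {
        # Q encoding: "_" stands for a space, "=XX" is a hex-escaped byte.
        ($bytes = $data) =~ tr/_/ /;
        $bytes =~ s/=([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    }
    # Interpret the raw bytes in the declared charset.
    return decode($charset, $bytes);
}

# Hypothetical helper: replace every encoded-word in a header value.
sub decode_header {
    my ($header) = @_;
    # Whitespace between two adjacent encoded-words is not part of the text.
    $header =~ s/\?=\s+(?==\?)/?=/g;
    $header =~ s/=\?([^?\s]+)\?([BbQq])\?([^?\s]*)\?=/decode_word($1, $2, $3)/ge;
    return $header;
}

my $b_text = decode_header('=?utf-8?B?aMOpbGxv?=');
my $q_text = decode_header('=?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>');

binmode(STDOUT, ':encoding(UTF-8)');
print "$b_text\n$q_text\n";
```

    In practice a tested module is probably the safer choice; the sketch is mainly useful for seeing what the charset and encoding fields do.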

    Also read the various Internet Mail RFCs for more detail, mainly RFC 2047.

    Since you are using Perl, Encode::MIME::Header could be of use:

    SYNOPSIS

    use Encode qw/encode decode/;
    $utf8   = decode('MIME-Header', $header);
    $header = encode('MIME-Header', $utf8);
    

    ABSTRACT

    This module implements RFC 2047 Mime Header Encoding. There are 3 variant encoding names; MIME-Header, MIME-B and MIME-Q. The difference is described below

                  decode()          encode()  
    MIME-Header   Both B and Q      =?UTF-8?B?....?=  
    MIME-B        B only; Q croaks  =?UTF-8?B?....?=  
    MIME-Q        Q only; B croaks  =?UTF-8?Q?....?=
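
    As a rough illustration of the table, the sketch below encodes a made-up string with the MIME-B and MIME-Q variants and decodes both results with MIME-Header. The sample text is an assumption, and the exact encoded output may differ between Encode versions, so none is shown here.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical sample text with one non-ASCII character (U+00E9).
my $text = "h\x{e9}llo";

my $as_b = encode('MIME-B', $text);   # encoded-word using base64
my $as_q = encode('MIME-Q', $text);   # encoded-word using Q encoding

# MIME-Header accepts either form when decoding.
my $from_b = decode('MIME-Header', $as_b);
my $from_q = decode('MIME-Header', $as_q);

binmode(STDOUT, ':encoding(UTF-8)');
print "$as_b\n$as_q\n$from_b\n$from_q\n";
```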
    