Why Base64 in Basic Authentication

情歌与酒 2020-12-06 19:18

Why does the resulting "username:password" string have to be encoded with Base64 in the Authorization header? What's the background?

2 Answers
  • 2020-12-06 19:24

    To understand the following, you should have a clear understanding of the differences between "character set" and "character encoding".

    Also, please keep in mind that Base64 is an encoding, and encoding is not encryption. Anything encoded in Base64 is intentionally easy to decode.
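
To make the "encoding is not encryption" point concrete, here is a minimal Python sketch that recovers the credentials from a captured header value with nothing but the standard library. The header value is the well-known sample from RFC 2617; the variable names are illustrative, and decoding the bytes as UTF-8 is an assumption of this sketch.

```python
import base64

# Sample Authorization header value (the example credentials from RFC 2617).
header_value = "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=="

# Split the scheme from the Base64 payload, then simply decode it.
scheme, _, encoded = header_value.partition(" ")
decoded = base64.b64decode(encoded).decode("utf-8")  # charset is an assumption

# The first colon separates the user id from the password.
userid, _, password = decoded.partition(":")
print(scheme, userid, password)
```

No key or secret is involved at any step, which is why Basic auth should only be used over a protected transport such as TLS.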

    The Base64 encoding, most importantly, ensures that the user:pass characters are all part of the ASCII character set and ASCII encoded. A user:pass in HTTP Basic auth is part of the Authorization header-field value. HTTP header values are ASCII (or Extended ASCII) encoded/decoded. So, when you Base64 encode a user:pass, you ensure that it is ASCII, and is therefore a valid header-field value.
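
As a rough illustration of that point, the sketch below builds a header value from credentials that contain non-ASCII characters. The helper name `basic_auth_header`, the sample credentials, and the default UTF-8 charset are all assumptions made for this example; the RFCs discussed here do not mandate UTF-8.

```python
import base64

def basic_auth_header(userid: str, password: str, charset: str = "utf-8") -> str:
    """Build a Basic Authorization header value (hypothetical helper)."""
    # The raw user:pass bytes may contain values above 127.
    raw = f"{userid}:{password}".encode(charset)
    # Base64 maps arbitrary bytes onto A-Z, a-z, 0-9, '+', '/' and '='.
    token = base64.b64encode(raw).decode("ascii")
    return f"Basic {token}"

raw_bytes = "jürgen:pässwörd".encode("utf-8")
header = basic_auth_header("jürgen", "pässwörd")

print(any(b > 127 for b in raw_bytes))  # the raw credentials are not pure ASCII
print(header.isascii())                 # the header value is
```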

    Base64 encoding also adds at least some kind of obfuscation to the clear-text user:pass. Again, this is not encryption. But, it does prevent normal humans from reading a user:pass at a glance. This seems almost meaningless from a security perspective, and I only include it because of the following background info.

    Some Background

    If you have a look at RFC 2616 (now obsolete) and RFC 2617, you'll see that they define both header field-values and Basic auth user:pass, respectively, as TEXT; i.e., ISO-8859-1 OCTETs (ISO-8859-1 is an 8-bit Extended ASCII encoding). This is odd, because it makes it seem like the authors intended that a compliant user:pass should use the same character set/encoding as that required for HTTP headers, in which case the Base64 encoding seems pretty meaningless except for the trivial obfuscation.

    That said, it's hard to believe that the authors of those RFCs didn't think of usernames/passwords containing characters outside ASCII (or outside ISO-8859-1). Assuming they had such user:pass values in mind, they might have been concerned about how to include, maintain, and transmit non-ASCII bytes in the middle of an otherwise all-ASCII set of headers. Base64 encoding the user:pass certainly solves that problem nicely. There's also the more canonical reason for using Base64: to make data transmission more reliable over channels that are not 8-bit clean. My understanding, however, is that HTTP is 8-bit clean, so even though the headers are shipped as ASCII, I don't think the Base64 encoding of user:pass was meant to make its transmission more reliable.

    Without asking the original authors, I'm not sure we'll ever know for sure. Here's an interesting comment on the topic by Julian Reschke. He's the author of RFC 5987, Character Set and Language Encoding for Hypertext Transfer Protocol (HTTP) Header Field Parameters. He has also done a lot of work on HTTP RFCs, including the latest HTTP 1.1 RFC overhaul.

    The current HTTP 1.1 RFC which deals with HTTP header encoding, RFC 7230, now recommends US-ASCII (aka ASCII, 7-bit ASCII) for headers. RFC 5987 defines an encoding for header field parameters -- presumably some are using this. RFC 7235 is a recent update to RFC 2617 on HTTP Authentication.

  • 2020-12-06 19:40

    This is the production rule for the userid-password tuple before it’s encoded:

    userid-password   = [ token ] ":" *TEXT
    

    Here token is specified as follows:

       token          = 1*<any CHAR except CTLs or tspecials>
    

    This is basically any US-ASCII character in the range 32 to 126, excluding the special characters (tspecials): ( ) < > @ , ; : \ " / [ ] ? = { } as well as space and horizontal tab.
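
A small Python sketch of these two rules, under the assumption that the decoded string is already available as text: `is_token` checks the token production, and `split_credentials` splits on the first colon only, since a token cannot contain ":" while the password part can. Both function names are made up for this example.

```python
# tspecials from RFC 2616, including space and horizontal tab.
TSPECIALS = set('()<>@,;:\\"/[]?={} \t')

def is_token(s: str) -> bool:
    """True if s is one or more CHARs that are neither CTLs nor tspecials."""
    return len(s) > 0 and all(
        32 <= ord(c) <= 126 and c not in TSPECIALS for c in s
    )

def split_credentials(decoded: str) -> tuple[str, str]:
    """Split a decoded userid-password string on its first colon."""
    userid, sep, password = decoded.partition(":")
    if not sep:
        raise ValueError("missing ':' separator")
    return userid, password

print(is_token("alice"), is_token("al:ice"))
print(split_credentials("alice:pa:ss"))
```

Note that the userid is optional in the grammar, so an empty userid before the colon is accepted by `split_credentials`.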

    And TEXT is specified as follows:

       TEXT           = <any OCTET except CTLs,
                        but including LWS>
    

    This is basically any sequence of octets (0–255) except control characters (code points 0–31 and 127), but including linear whitespace (LWS): one or more space or horizontal tab characters, optionally preceded by a CRLF sequence:

       LWS            = [CRLF] 1*( SP | HT )
    

    Although LWS (even when it includes a line fold) doesn’t terminate a header field value, it has the same semantics as a single space:

    All linear whitespace, including folding, has the same semantics as SP.

    So, to preserve such sequences exactly as they are, the string is encoded before it’s placed in the field value.
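
The effect can be simulated in a few lines of Python. In this sketch, `collapse_lws` is a hypothetical stand-in for any recipient or intermediary that treats an LWS run as a single space; the sample credentials and the use of ISO-8859-1 (`latin-1`) are assumptions for illustration.

```python
import base64
import re

# LWS = [CRLF] 1*( SP | HT )
LWS = re.compile(r"(?:\r\n)?[ \t]+")

def collapse_lws(value: str) -> str:
    """Replace every LWS run with a single space (simulated normalization)."""
    return LWS.sub(" ", value)

credentials = "alice:two  spaces\tand a tab"

# Sent raw, normalization would silently change the password.
raw_after_transit = collapse_lws(credentials)

# Sent Base64-encoded, the field value contains no whitespace to normalize.
encoded = base64.b64encode(credentials.encode("latin-1")).decode("ascii")
recovered = base64.b64decode(collapse_lws(encoded)).decode("latin-1")

print(raw_after_transit == credentials)  # the raw form was altered
print(recovered == credentials)          # the encoded form survived intact
```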
