RFC 6265, Section 6.1, says browsers should allow at least 4096 bytes per cookie (measured as the sum of the lengths of the cookie's name, value, and attributes).
Now in order to know the number of characters allowed per cookie, I need to know
No matter how the cookies are stored internally by the browser, they eventually have to be transferred within the Set-Cookie and Cookie HTTP header fields. It is the encoded length of those fields that the authors of the RFC most probably had in mind. At least in most RFCs that would be the case, so why not assume it here? Consequently, "the size of a cookie" depends on the way it is encoded within an HTTP header.
According to the standard, request header fields should be
the OCTETs making up the field-value and consisting of either *TEXT or combinations of token, separators, and quoted-string
where *TEXT, in turn:
MAY contain characters from character sets other than ISO-8859-1 [22] only when encoded according to the rules of RFC 2047.
RFC 2047 defines what is known as "MIME encoding" and, as I read it, has some funny rules. Namely, to encode a foreign charset you have to use either a "quoted-printable" format: =?UTF-8?Q?=48=65=6c=6c=6f?= or a "Base64" format: =?UTF-8?B?SGVsbG8=?=. (Note that both examples encode the word "Hello". The first uses 27 bytes and the second uses 20; neither figure includes the cookie name and attributes.)
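The two lengths quoted above are easy to reproduce. The following is only a minimal sketch in Python; the helper names q_encode_all and b_encode are made up for illustration, and the "Q" variant deliberately hex-escapes every byte (as in the example above) even though RFC 2047 would let plain ASCII letters through unescaped.

```python
import base64

def q_encode_all(text: str, charset: str = "UTF-8") -> str:
    """Build a 'Q' encoded-word, hex-escaping every byte (hypothetical helper)."""
    payload = "".join(f"={byte:02x}" for byte in text.encode(charset))
    return f"=?{charset}?Q?{payload}?="

def b_encode(text: str, charset: str = "UTF-8") -> str:
    """Build a 'B' (Base64) encoded-word (hypothetical helper)."""
    payload = base64.b64encode(text.encode(charset)).decode("ascii")
    return f"=?{charset}?B?{payload}?="

q_word = q_encode_all("Hello")
b_word = b_encode("Hello")

print(q_word, len(q_word))  # =?UTF-8?Q?=48=65=6c=6c=6f?= 27
print(b_word, len(b_word))  # =?UTF-8?B?SGVsbG8=?= 20
```

Either way, the fixed 12-character wrapper (charset, encoding letter, delimiters) is pure overhead on top of the encoded payload.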
Moreover, according to RFC 2047 you may not have "encoded words" longer than 75 characters (and a header line containing them is limited to 76 characters), hence, if I understand things correctly, your longer cookie values would have to be encoded as a bunch of pieces of at most 75 bytes, each piece starting with the =?UTF-8?Q? mumbo-jumbo.
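To get a feel for what that splitting would look like, here is a rough sketch in Python, not a reference implementation. It assumes the Base64 ("B") form, a 75-character cap per encoded word, and that no character may be split across two words; the function name split_into_encoded_words and the sample text are invented for the example.

```python
import base64

MAX_ENCODED_WORD = 75                  # assumed per-word cap (RFC 2047)
PREFIX, SUFFIX = "=?UTF-8?B?", "?="    # 12 characters of fixed overhead

# Base64 maps 3 raw bytes to 4 characters. 75 - 12 = 63 payload characters,
# rounded down to a multiple of 4, gives 60 characters, i.e. 45 raw bytes.
MAX_RAW_BYTES = (MAX_ENCODED_WORD - len(PREFIX) - len(SUFFIX)) // 4 * 3

def _wrap(raw: bytes) -> str:
    return PREFIX + base64.b64encode(raw).decode("ascii") + SUFFIX

def split_into_encoded_words(text: str) -> list[str]:
    """Split text into encoded words, never cutting a character in half."""
    words, chunk = [], b""
    for ch in text:
        encoded = ch.encode("utf-8")
        if len(chunk) + len(encoded) > MAX_RAW_BYTES:
            words.append(_wrap(chunk))
            chunk = b""
        chunk += encoded
    if chunk:
        words.append(_wrap(chunk))
    return words

sample = "Привет, мир! " * 10   # hypothetical non-ASCII value
words = split_into_encoded_words(sample)
for word in words:
    print(len(word), word)
```

With these assumptions each word carries at most 45 bytes of actual data, so the wrapper costs 12 characters per 45 bytes on top of the 4/3 Base64 expansion.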
I tested what would happen if I set a non-ASCII (Russian language) cookie using PHP via Apache. The resulting Set-Cookie header had no charset specification, used URL-encoding and was longer than 76 bytes (so much for the standards, right?):
CookieName=%D0%92+%D0...%B0%D0%B9; expires=Thu, 11-Sep-2014 19:59:18 GMT; path=/tmp/; domain=.some.domain.
The total length of the cookie (value plus attributes), corresponding to an otherwise 176-character sentence, was 923 bytes.
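The blow-up is easy to estimate: with URL-encoding, every Cyrillic letter takes two UTF-8 bytes and each byte becomes a three-character %XX escape, so one letter costs six bytes on the wire. Below is a small sketch in Python rather than PHP; it assumes that urllib.parse.quote_plus behaves like the encoding seen in the header above (percent escapes, spaces as "+"). The sample text, the helper set_cookie_size and the attribute string are made up for the example.

```python
from urllib.parse import quote_plus

RFC6265_MIN_BYTES = 4096  # minimum per-cookie size suggested by RFC 6265, Section 6.1

def set_cookie_size(name: str, value: str, attributes: str = "") -> int:
    """Size in bytes of 'name=<url-encoded value><attributes>' (hypothetical helper)."""
    header_value = f"{name}={quote_plus(value)}{attributes}"
    return len(header_value.encode("ascii"))

sample = "Привет мир"            # 10 characters: 9 Cyrillic letters and 1 space
encoded = quote_plus(sample)

print(len(sample))               # 10
print(len(encoded))              # 55 = 9 letters * 6 bytes + 1 byte for "+"
print(encoded)

size = set_cookie_size("CookieName", sample, "; path=/")
print(size, size <= RFC6265_MIN_BYTES)
```

So a cookie that looks short when counted in characters can be five to six times larger when counted in header bytes, which is the number the 4096-byte figure is measured against.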
To summarize, I don't think you can get a strict answer to your question, but it's a fun question nonetheless.
It seems the encoding is determined more by the programmer (behind the browser) than by the programming language. Usually cookie values are URL-encoded, but there is no requirement to do so.
Have a look at this answer, which completes your study (adding the Safari special case). This one might help too.