For HTTP responses with Content-Types suggesting character data, which charset should be assumed by the client if none is specified?


说谎 2021-02-07 23:08

If no charset parameter is specified in the Content-Type header, RFC 2616 section 3.7.1 seems to imply that ISO-8859-1 should be assumed for media subtypes of the "text" type.
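A minimal sketch of that reading of the RFC, in Python. The function name `charset_per_rfc2616` is hypothetical; the parsing relies on the standard library's `email.message.Message`, which lowercases any declared charset. It only illustrates the rule as written, not what any real client does.

```python
from email.message import Message

# Default named by RFC 2616 section 3.7.1 for the "text" type.
RFC2616_TEXT_DEFAULT = "ISO-8859-1"


def charset_per_rfc2616(content_type):
    """Return the declared charset, else the RFC default for text/*, else None."""
    msg = Message()
    msg["Content-Type"] = content_type

    # An explicit charset parameter always wins.
    declared = msg.get_content_charset()
    if declared is not None:
        return declared

    # No charset parameter: the RFC default applies only to the "text" type.
    if msg.get_content_maintype() == "text":
        return RFC2616_TEXT_DEFAULT

    # Other media types have no default under this rule.
    return None
```

For example, `charset_per_rfc2616("text/plain")` falls through to the ISO-8859-1 default, while a header with no charset on a non-text type yields `None`.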

6 Answers

盖世英雄少女心 (OP)

2021-02-07 23:22

All major browsers I've checked (IE, Firefox and Opera) completely ignore this part of the RFC.

If you are interested in the algorithm for auto-detecting the charset from the data itself, see the Mozilla Firefox link below.

Just a small note about content types: only text has character sets. It's reasonable to assume that browsers handle application/x-javascript the same way they handle text/javascript (except IE6, but that's another subject).

Internet Explorer will use the default charset (probably stored in the registry), as noted:

By default, Internet Explorer uses the character set specified in the HTTP content type returned by the server to determine this translation. If this parameter is not given, Internet Explorer uses the character set specified by the meta element in the document. It uses the user's preferences if no meta element is specified.

Source: http://msdn.microsoft.com/en-us/library/ms537500%28VS.85%29.aspx

Mozilla Firefox attempts to auto-detect the charset, as described here:

This paper presents three types of auto-detection methods to determine encodings of documents without explicit charset declaration.

Source: http://www.mozilla.org/projects/intl/UniversalCharsetDetection.html

Opera uses auto-detection too, as documented:

If the transport protocol provides an encoding name, that is used. If not, Opera will look at the page for a charset declaration. If this is missing, Opera will attempt to auto-detect the encoding, using the domain name to see if the script is a CJK script, and if so which one. Opera can also auto-detect UTF-8.

Source: http://www.opera.com/docs/specs/opera9/
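The fallback order quoted above (transport charset, then an in-page declaration, then auto-detection, then a default) can be sketched roughly as follows. This is a simplified illustration, not any browser's actual algorithm: the name `pick_charset`, the naive `META_CHARSET` pattern, the 1024-byte search window and the `user_default` value are all assumptions, and the domain-based CJK detection Opera describes is left out entirely.

```python
import re
from email.message import Message

# Naive pattern (an assumption, not a real HTML parser) matching both
# <meta charset="..."> and <meta http-equiv=... content="...; charset=...">.
META_CHARSET = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?([A-Za-z0-9._:-]+)', re.IGNORECASE
)


def pick_charset(content_type, body, user_default="ISO-8859-1"):
    """Choose a charset for `body` (bytes) using a header/meta/detect/default chain."""
    # 1. Charset supplied by the transport protocol (HTTP Content-Type header).
    if content_type:
        msg = Message()
        msg["Content-Type"] = content_type
        declared = msg.get_content_charset()
        if declared:
            return declared

    # 2. Charset declaration inside the page; only the start is searched.
    match = META_CHARSET.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower()

    # 3. Very rough auto-detection: accept UTF-8 when the bytes decode
    #    cleanly and at least one byte is non-ASCII.
    try:
        body.decode("utf-8")
        if any(byte >= 0x80 for byte in body):
            return "utf-8"
    except UnicodeDecodeError:
        pass

    # 4. Nothing conclusive: fall back to the user's preference.
    return user_default
```

A real detector would weigh many candidate encodings statistically, as the Mozilla paper describes; step 3 here only separates valid non-ASCII UTF-8 from everything else.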
