Question
I am using the accept-charset="utf-8" attribute in a form and found that when I do a form post with non-ASCII characters, the headers have different accept charset option in the request header. Is there anything I am missing? My form looks like this:
<form method="post" action="controller" accept-charset="UTF-8">
..input text box
.. submit button
</form>
Thanks in advance
Answer 1:
The question, as asked, is self-contradictory: the heading says that the accept-charset parameter does not do anything, whereas the question body says that when the accept-charset attribute (this is the correct term) is used, “the headers have different accept charset option in the request header”. I suppose a negation is missing from the latter statement.
Browsers send Accept-Charset parameters in HTTP request headers according to their own principles and settings. For example, my Chrome sends Accept-Charset:windows-1252,utf-8;q=0.7,*;q=0.3. Such a header is typically ignored by server-side software, but it could be used (and it was designed to be used) to determine which encoding is to be used in the server response, in case the server-side software (a form handler, in this case) is capable of using different encodings in the response.
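To illustrate how a form handler could use such a header, here is a minimal Python sketch that parses the value and picks a response encoding. The function names and the list of supported encodings are hypothetical, and real server frameworks may handle this negotiation differently or not at all.

```python
from typing import Optional


def parse_accept_charset(header: str) -> list[tuple[str, float]]:
    """Parse an Accept-Charset header value into (charset, q) pairs, best first."""
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        name, *params = [part.strip() for part in item.split(";")]
        q = 1.0  # a missing q parameter means full preference
        for param in params:
            key, _, value = param.partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable weight as "not acceptable"
        prefs.append((name.lower(), q))
    # sorted() is stable, so charsets with equal q keep their header order
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def choose_response_charset(header: str, supported: list[str]) -> Optional[str]:
    """Return the client's most preferred charset that the server supports."""
    supported_lower = [s.lower() for s in supported]
    for name, q in parse_accept_charset(header):
        if q <= 0:
            continue
        if name == "*":
            # Wildcard: any charset is acceptable, so use the server's first choice
            return supported_lower[0] if supported_lower else None
        if name in supported_lower:
            return name
    return None


header = "windows-1252,utf-8;q=0.7,*;q=0.3"  # the header value quoted above
print(choose_response_charset(header, ["utf-8", "iso-8859-1"]))
```

With the header quoted above and a server that supports only UTF-8 and ISO-8859-1, windows-1252 is skipped and UTF-8 is chosen.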
The accept-charset attribute in a form element is not expected to affect HTTP request headers, and it does not. It is meant to specify the character encoding to be used for the form data in the request, and this is what it actually does. The HTML 4.01 spec is obscure about this, but the W3C HTML5 draft puts it much better, though for some odd reason uses plural: “gives the character encodings that are to be used for the submission”. I suppose the reason is that you could specify alternate encodings, to prepare for situations where a browser is unable to use your preferred encoding. And what actually happens in Chrome, for example, is that if you use accept-charset="foobar utf-8", then UTF-8 is used.
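The selection behavior described above can be approximated with a short Python sketch: walk the space-separated list and take the first encoding that is recognized. This is only an illustration. The function name is hypothetical, Python's codec registry stands in for the browser's own list of encoding labels, and the fallback to UTF-8 when nothing matches is an assumption about browser behavior.

```python
import codecs


def pick_form_encoding(accept_charset: str) -> str:
    """Return the first recognized encoding in a space-separated list, else UTF-8."""
    for label in accept_charset.split():
        try:
            # codecs.lookup() raises LookupError for names Python does not know
            return codecs.lookup(label).name
        except LookupError:
            continue  # unknown label such as "foobar": try the next one
    return "utf-8"  # assumed fallback when no label is usable


print(pick_form_encoding("foobar utf-8"))  # utf-8
```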
In practice, the attribute is used to make the encoding of data submission different from the encoding of the page containing the form. Suppose your page is ISO-8859-1 encoded and someone types Greek or Hebrew letters into your form. Browsers will have to do some error recovery, since those characters cannot be represented in ISO-8859-1. (In practice they turn the characters into numeric character references, which is logically all wrong but pragmatically perhaps the best they can do.) Using <form accept-charset=utf-8> helps here: no matter what the encoding of the page is, the form data will be sent as UTF-8, which can handle any character.
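The difference can be seen by encoding the same Greek input both ways. The Python sketch below uses the standard library's urlencode to mimic what ends up in the request body. The field name is hypothetical, and the xmlcharrefreplace error handler is used here only to imitate the numeric-character-reference fallback described above.

```python
from urllib.parse import urlencode

fields = {"name": "αβγ"}  # hypothetical field name and Greek input

# Submission encoded as UTF-8: every character is representable.
as_utf8 = urlencode(fields, encoding="utf-8")

# Submission encoded as ISO-8859-1: Greek letters are not representable, so
# they are replaced with numeric character references before percent-encoding.
as_latin1 = urlencode(fields, encoding="iso-8859-1", errors="xmlcharrefreplace")

print(as_utf8)    # name=%CE%B1%CE%B2%CE%B3
print(as_latin1)  # name=%26%23945%3B%26%23946%3B%26%23947%3B
```

In the second case the server receives the literal text &#945;&#946;&#947; and cannot tell whether the user typed Greek letters or those exact ASCII characters.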
If you wish to tell the form handler which encoding it should use in its response, then you can add a hidden (or non-hidden) field into the form for that.
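A minimal sketch of a form handler that honors such a field might look like the following. The field name response_charset, the name field, and the function itself are all hypothetical, and the sketch assumes the form data was submitted as UTF-8.

```python
from urllib.parse import parse_qs


def build_response(body: str, default_charset: str = "utf-8") -> tuple[str, bytes]:
    """Pick the response encoding from a hypothetical 'response_charset' field."""
    fields = parse_qs(body, encoding="utf-8")  # assumes a UTF-8 submission
    requested = fields.get("response_charset", [default_charset])[0]
    text = "Hello, " + fields.get("name", [""])[0]
    try:
        payload = text.encode(requested)
    except (LookupError, UnicodeEncodeError):
        # Unknown charset, or one that cannot represent the text: use the default
        requested = default_charset
        payload = text.encode(requested)
    return f"text/plain; charset={requested}", payload


# Body as it might arrive from a form with a hidden response_charset field
content_type, payload = build_response("name=%CE%B1&response_charset=iso-8859-7")
print(content_type)  # text/plain; charset=iso-8859-7
```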
Source: https://stackoverflow.com/questions/12830546/accept-charset-utf-8-parameter-doesnt-do-anything-when-used-in-form