Question
A coworker of mine created a basic contact-us type form, which is mangling accented characters (è, é, à, etc). We're using KonaKart, a Java e-commerce platform, on Struts 1.
I've narrowed the issue down to the data coming in through the HttpServletRequest object. Comparing it with a similar (properly functioning) form, I noticed that on the old form the request object's character encoding (request.getCharacterEncoding()) is returned as "UTF-8", but on the new form it comes back as null, and the text coming out of request.getParameter() is already mangled.
Aside from that, I haven't found any significant differences between the known-good form and the new-and-broken form.
Things I've ruled out:
- Both HTML pages have the tag:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
- Both form tags in the HTML use POST, and do not set encodings
- Checking from Firebug, both the Request and Response headers have the same properties
- Both JSP pages use the same attributes in the <%@page %> tag:
<%@page contentType="text/html;charset=UTF-8" language="java" %>
- There's nothing remotely interesting going on in the *Form.java files; both inherit from BaseValidatorForm
- I've checked the source file encodings; they're all set to "Default - inherited from Container: UTF-8"
If I convert the parameter values from ISO-8859-1 to UTF-8, it works great, but I would much rather figure out the core issue. For example:
new String(request.getParameter("firstName").getBytes("ISO-8859-1"),"UTF8")
Any suggestions are welcome, I'm all out of ideas.
Answer 1:
Modern browsers usually don't supply the character encoding in the Content-Type header of the HTTP request. For HTML form based applications, however, the browser encodes the submitted data with the same character encoding as specified in the Content-Type header of the initial HTTP response that served the page with the form. You need to explicitly set the request character encoding to that same encoding yourself, which in your case is UTF-8.
request.setCharacterEncoding("UTF-8");
Do this before any request parameter is retrieved from the request (otherwise it's too late; the server platform default encoding would then be used to parse the parameters, which is indeed often ISO-8859-1). A servlet filter mapped on /* is a perfect place for this.
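To see why the encoding has to be known at the moment the parameters are parsed, here is a minimal sketch that uses only the JDK, with no servlet container involved. It assumes the browser sent "é" from a UTF-8 page as the percent-encoded bytes %C3%A9, and it uses java.net.URLDecoder as a rough stand-in for the container's parameter parsing. The class and method names (FormDecodingDemo, decode, repair) are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

class FormDecodingDemo {
    // Decodes one URL-encoded form value with the given charset name.
    // This only approximates what a container does when it parses a POST body.
    static String decode(String rawValue, String charsetName) {
        try {
            return URLDecoder.decode(rawValue, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    // Reverses a wrong ISO-8859-1 decode by recovering the original bytes
    // and decoding them as UTF-8 (the workaround from the question).
    static String repair(String mangled) {
        return new String(mangled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String raw = "%C3%A9"; // "é" encoded as UTF-8, then percent-encoded

        String wrong = decode(raw, "ISO-8859-1"); // two characters: U+00C3 U+00A9
        String right = decode(raw, "UTF-8");      // one character: U+00E9

        System.out.println("ISO-8859-1 decode length: " + wrong.length());
        System.out.println("UTF-8 decode length: " + right.length());
        System.out.println("repair matches UTF-8 decode: " + repair(wrong).equals(right));
    }
}
```

Once the value has been decoded with the wrong charset, the only way back is the byte round trip in repair(), which is why setting the encoding up front in a filter is the cleaner fix.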
See also:
- Unicode - How to get the characters right?
Answer 2:
request.getCharacterEncoding() relies on the Content-Type request header, not Accept-Charset. So application/x-www-form-urlencoded;charset=ISO8859_1 should work for the POST action. The <%@page tag doesn't affect the POST data.
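As a rough illustration of why the call returns null, here is a simplified sketch of pulling the charset parameter out of a Content-Type header value. This is not the container's actual code; the class and method names (ContentTypeCharset, charsetOf) are hypothetical, and real containers handle more edge cases.

```java
import java.util.Locale;

class ContentTypeCharset {
    // Returns the charset parameter of a Content-Type header value,
    // or null when the header is missing or carries no charset parameter.
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type itself; parameters start at index 1.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // What browsers typically send for a form POST: no charset, so null.
        System.out.println(charsetOf("application/x-www-form-urlencoded"));
        // With an explicit charset parameter the value is reported back.
        System.out.println(charsetOf("application/x-www-form-urlencoded; charset=UTF-8"));
    }
}
```

The first call prints null and the second prints UTF-8, which mirrors the difference observed between the two forms if something (for example a filter) has set the encoding for the working one.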
Source: https://stackoverflow.com/questions/12358101/request-getcharacterencoding-returns-null-why