request.getCharacterEncoding() returns NULL… why?

Posted by 冷眼眸甩不掉的悲伤 on 2019-12-18 13:29:37

Question


A coworker of mine created a basic contact-us type form, which is mangling accented characters (è, é, à, etc.). We're using KonaKart, a Java e-commerce platform, on Struts 1.

I've narrowed the issue down to the data coming in through the HttpServletRequest object. Comparing a similar (properly functioning) form, I noticed that on the old form the request object's Character Encoding (request.getCharacterEncoding()) is returned as "UTF-8", but on the new form it is coming back as NULL, and the text coming out of request.getParameter() is already mangled.

Aside from that, I haven't found any significant differences between the known-good form, and the new-and-broken form.

Things I've ruled out:

  • Both HTML pages have the tag: <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
  • Both form tags in the HTML use POST, and do not set encodings
  • Checking from Firebug, both the Request and Response headers have the same properties
  • Both JSP pages use the same attributes in the <%@page contentType="text/html;charset=UTF-8" language="java" %> tag
  • There's nothing remotely interesting going on in the *Form.java files, both inherit from BaseValidatorForm
  • I've checked the source file encodings, they're all set to Default - inherited from Container: UTF-8

If I convert the parameter values from ISO-8859-1 to UTF-8, it works great, but I would much rather figure out the core issue. E.g.: new String(request.getParameter("firstName").getBytes("ISO-8859-1"),"UTF8")
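Why that workaround repairs the text can be shown with a minimal standalone sketch. It assumes the browser sent UTF-8 bytes and the server decoded them as ISO-8859-1; the class and method names (MojibakeDemo, misdecode, repair) and the sample value are made up for illustration and are not part of KonaKart or Struts.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Simulates the suspected failure: UTF-8 bytes sent by the browser
    // are decoded by the server as ISO-8859-1.
    static String misdecode(String original) {
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // The workaround from the question: re-encode the mangled text as
    // ISO-8859-1 to recover the raw bytes, then decode them as UTF-8.
    static String repair(String mangled) {
        byte[] rawBytes = mangled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "Ren\u00e9"; // "René", hypothetical form input
        String mangled = misdecode(original);
        System.out.println("mangled length:  " + mangled.length());
        System.out.println("repaired equals: " + repair(mangled).equals(original));
    }
}
```

The round trip is lossless only because ISO-8859-1 maps every byte value to exactly one character, so the original bytes survive the wrong decoding. That is why the workaround hides the problem rather than fixing it.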

Any suggestions are welcome, I'm all out of ideas.


Answer 1:


Modern browsers usually don't supply the character encoding in the Content-Type header of the HTTP request. For HTML form submissions, however, the browser uses the same character encoding as specified in the Content-Type header of the initial HTTP response that served the page containing the form. You need to explicitly set the request character encoding to that same encoding yourself, which in your case is UTF-8.

request.setCharacterEncoding("UTF-8");

Do this before any request parameter has been retrieved from the request (otherwise it's too late; the server platform default encoding would then be used to parse the parameters, which is indeed often ISO-8859-1). A servlet filter mapped on /* is a perfect place for this.
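The effect of the encoding used to parse the parameters can be illustrated outside a servlet container. The sketch below uses only java.net.URLDecoder to decode one percent-encoded form value under both charsets. It is an approximation of what a container does when parsing a POST body, not the container's actual code; the class name FormDecodingDemo, the helper decodeValue, and the sample value are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class FormDecodingDemo {

    // Decodes one percent-encoded form value using the named charset.
    // The checked exception is wrapped so callers need no try/catch.
    static String decodeValue(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // "René" as a browser would submit it from a UTF-8 page:
        // the é is sent as the two bytes C3 A9.
        String encoded = "Ren%C3%A9";

        // Decoded with the encoding the page was served in.
        System.out.println(decodeValue(encoded, "UTF-8"));

        // Decoded with a typical server default: each byte becomes its
        // own character, which is the mangling seen in the question.
        System.out.println(decodeValue(encoded, "ISO-8859-1"));
    }
}
```

In a real application the equivalent choice is made by calling request.setCharacterEncoding("UTF-8") before the first getParameter() call, as described above.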

See also:

  • Unicode - How to get the characters right?



Answer 2:


request.getCharacterEncoding() relies on the Content-Type request header, not on Accept-Charset.

So a request Content-Type of application/x-www-form-urlencoded;charset=ISO8859_1 should work for the POST action. The <%@page tag doesn't affect the POST data.



Source: https://stackoverflow.com/questions/12358101/request-getcharacterencoding-returns-null-why
