Question
I am calling my servlet URL from an external source. One of the parameters contains Hindi text, which the external source URL-encodes. The encoded value is:
%E0%A4%AA%E0%A4%BE%E0%A4%A0%E0%A5%8D%E0%A4%AF%20%E0%A4%AD%E0%A4%BE%E0%A4%97
I can see it in a TCP dump in Wireshark, but I do not get this encoded string in my servlet application. I am reading it with the getParameter() method, which returns seemingly random characters.
Since I am not getting the correct value, I tried decoding it in my servlet class with
URLDecoder.decode(myString, "UTF-8");
but that also returns random characters, like this:
विषय वस�त�
Please suggest how to read this encoded text in a servlet and decode it back to the original value.
Answer 1:
I am trying to get it via getParameter() method.
getParameter and the handling of input encodings in Servlet are broken in general. You get ISO-8859-1 whether you want it or not (and you generally don't).
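To make the mis-decoding concrete, here is a minimal sketch that decodes the value from the question twice: once as ISO-8859-1, which is what the container is assumed to do here, and once as UTF-8. The class name MisdecodeDemo and the helper decode are made up for this illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class MisdecodeDemo {
    // The encoded value from the question.
    static final String ENCODED =
        "%E0%A4%AA%E0%A4%BE%E0%A4%A0%E0%A5%8D%E0%A4%AF%20%E0%A4%AD%E0%A4%BE%E0%A4%97";

    // Percent-decodes `encoded` and interprets the resulting bytes in `charsetName`.
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // Assumption: this mimics a container that treats the bytes as ISO-8859-1.
        String asLatin1 = decode(ENCODED, "ISO-8859-1"); // one char per byte
        String asUtf8 = decode(ENCODED, "UTF-8");        // multi-byte sequences combined
        System.out.println("ISO-8859-1 length: " + asLatin1.length());
        System.out.println("UTF-8 length:      " + asUtf8.length());
    }
}
```

The same 25 bytes come out as 25 unrelated characters under ISO-8859-1 and as the 9 intended characters under UTF-8, which is why the parameter looks like random characters.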
You can work around this and get UTF-8 for query string parameters by:
- Container-specific configuration options (e.g. Tomcat's URIEncoding).
- Grabbing the raw request.getQueryString() and passing its pieces into URLDecoder.decode(..., "utf-8") manually instead of relying on getParameter. Only if you are taking this route do you need to worry about URLDecoder yourself.
- Fixing up the mis-decoding of the getParameter output by encoding the bad value back to the original bytes it came from (using ISO-8859-1) and then decoding it as UTF-8, e.g. new String(request.getParameter("param").getBytes("iso-8859-1"), "utf-8").
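The second and third workarounds above can be sketched roughly as follows. This is an illustration under assumptions, not a drop-in utility: the class QueryParams and its methods parseUtf8 and repairLatin1 are hypothetical names, the parser keeps only the first value of a repeated parameter, and repairLatin1 is only correct if the container really decoded the bytes as ISO-8859-1.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class QueryParams {
    // Splits a raw query string (as returned by request.getQueryString())
    // and percent-decodes each name and value as UTF-8.
    static Map<String, String> parseUtf8(String rawQuery) {
        Map<String, String> params = new LinkedHashMap<>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params;
        }
        for (String pair : rawQuery.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            // Keep the first value when a name is repeated.
            params.putIfAbsent(decodeUtf8(name), decodeUtf8(value));
        }
        return params;
    }

    // Turns an ISO-8859-1 mis-decoded string back into its original bytes
    // and reads those bytes as UTF-8.
    static String repairLatin1(String misdecoded) {
        if (misdecoded == null) {
            return null;
        }
        byte[] originalBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    private static String decodeUtf8(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

In a servlet this would be used as QueryParams.parseUtf8(request.getQueryString()).get("param") or QueryParams.repairLatin1(request.getParameter("param")). Use one of the two, not both, and neither if the container is already configured for UTF-8.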
See this question for background.
Answer 2:
I've tried this:
try {
    System.out.println(URLDecoder.decode("%E0%A4%AA%E0%A4%BE%E0%A4%A0%E0%A5%8D%E0%A4%AF%20%E0%A4%AD%E0%A4%BE%E0%A4%97", "UTF-8"));
} catch (Exception e) {
    e.printStackTrace();
}
... and it works for me: the Hindi characters are printed and no exception is thrown.
Make sure your console is outputting UTF-8; it is probably using a different encoding.
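If changing the console setting is not an option, one way to rule out the JVM's default output encoding is to wrap System.out in a PrintStream that always writes UTF-8. This is only a sketch: the class name Utf8Console and the helper utf8Stream are made up here, and the terminal or IDE console still has to interpret the bytes as UTF-8 for the text to display correctly.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

final class Utf8Console {
    // Wraps `target` in a PrintStream that encodes text as UTF-8,
    // regardless of the platform default charset.
    static PrintStream utf8Stream(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        PrintStream out = utf8Stream(System.out);
        // The decoded text from the question, written with Unicode escapes
        // so the source file encoding does not matter.
        out.println("\u092A\u093E\u0920\u094D\u092F \u092D\u093E\u0917");
    }
}
```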
Edit
In Eclipse:
Run
Run Configurations...
"Common" tab
Encoding
[select UTF-8]
Edit II
Example code for the processRequest method of your HttpServlet class:
response.setContentType("text/html;charset=UTF-8");
// getParameter() has already percent-decoded the value.
String argument = request.getParameter("argument");
String decoded;
if (argument != null) {
    // Decoding a second time leaves text without '%' or '+' unchanged.
    decoded = URLDecoder.decode(argument, "UTF-8");
} else {
    decoded = "null";
}
PrintWriter out = response.getWriter();
try {
    out.println("<!DOCTYPE html>");
    out.println("<html>");
    out.println("<head>");
    out.println("<title>Servlet TestServlet</title>");
    out.println("</head>");
    out.println("<body>");
    out.println("<h1>The argument's value is: " + decoded + "</h1>");
    out.println("</body>");
    out.println("</html>");
} finally {
    out.close();
}
Source: https://stackoverflow.com/questions/17212353/how-to-process-encoded-unicode-text-in-servlet