Chinese character encoding?

Submitted by 雨燕双飞 on 2019-12-10 15:39:51

Question


I have a use case where I submit parameters to a Spring controller through a POST request. In the controller, I read the parameters and perform some actions. After that, I send those parameters as request params to another URL.

Here I am not able to handle Chinese characters; they come out garbled.

What I am doing now: 1) I pass the Chinese text below as a param named subject from an HTML page (this is not a JSP): 以下便是有关此问题的所有信息

2) When i read this value from request in controller, it is coming as : 以ä¸ä¾¿æ¯æå³æ­¤é®é¢çææä¿¡æ¯

3) I am not able to get the exact value that was submitted from the page.

It looks like it is already encoded when I check the encoded text with the tools at the URLs below: http://coderstoolbox.net/string/#!encoding=none&action=encode&charset=utf_8 http://www.cafewebmaster.com/online_tools/utf_decode

4) Now I want to pass the actual user-submitted string to another URL using response.sendRedirect. I tried decoding the URL to see if I could get the actual string, but with no success.

I am using a Tomcat server. I have defined UTF-8 encoding in server.xml and added a URLEncodingFilter in web.xml as the first filter mapping. This filter calls request.setCharacterEncoding with UTF-8.

Still, I am not able to track down where things are going wrong. Can someone suggest how to get the actual string back in the controller?

I also have the filter below in my web.xml:

<filter>
            <filter-name>EncodingFilter</filter-name>
            <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
            <init-param>
                <param-name>encoding</param-name>
                <param-value>UTF-8</param-value>
            </init-param>
            <init-param>
                <param-name>forceEncoding</param-name>
                <param-value>true</param-value>
            </init-param>
        </filter>

Let me know if you need any information to get more context.


Answer 1:


If you are using Tomcat, please change the Connector in the server.xml file as below:

<Connector connectionTimeout="20000" port="8080" protocol="HTTP/1.1"
    redirectPort="8443" useBodyEncodingForURI="true">
</Connector>

Hope this solves your problem.

Regards, Kishore
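
For context, useBodyEncodingForURI asks Tomcat to decode query-string parameters with the encoding of the request body instead of the connector's URI encoding. The sketch below is not Tomcat code. It only uses java.net.URLDecoder to illustrate how the same percent-encoded bytes turn into different strings depending on the charset used to decode them. The class name UriDecodingDemo and the sample value (the first two characters of the text from the question) are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class UriDecodingDemo {
    // U+4EE5 U+4E0B, percent-encoded as UTF-8 (three bytes per character).
    static final String ENCODED = "%E4%BB%A5%E4%B8%8B";

    // Decodes a percent-encoded value using the named charset.
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String asUtf8 = decode(ENCODED, "UTF-8");
        String asLatin1 = decode(ENCODED, "ISO-8859-1");
        // Decoded as UTF-8, the six bytes form two Chinese characters.
        System.out.println(asUtf8.length());   // prints 2
        // Decoded as ISO-8859-1, every byte becomes its own character.
        System.out.println(asLatin1.length()); // prints 6
    }
}
```

The six-character result is the kind of garbled value described in the question, which is why the charset chosen on the server side matters.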




Answer 2:


Try adding this filter to your web.xml:

<filter>
    <filter-name>characterEncodingFilter</filter-name>
    <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
    <init-param>
        <param-name>encoding</param-name>
        <param-value>UTF-8</param-value>
    </init-param>
    <init-param>
        <param-name>forceEncoding</param-name>
        <param-value>true</param-value>
    </init-param>
</filter>

and map it:

<filter-mapping>
    <filter-name>characterEncodingFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>

I had a similar problem and this solved it.




Answer 3:


After decoding as shown below, I am able to retrieve the actual string. I am still investigating why I need to do the Latin decoding, and I will update once I fully understand the problem. If any of you know the reason for the Latin encoding, please let me know.

// Needs java.io.UnsupportedEncodingException and a Hex helper
// (presumably org.apache.commons.codec.binary.Hex from Apache Commons Codec).
public String getEncodedSubject(String text) {
        if (text == null || text.isEmpty()) {
                return "";
        }
        try {
            byte[] encoding1 = text.getBytes("UTF-8");
            String string1 = new String(encoding1, 0, encoding1.length); // Default encoding of my platform is UTF-8
            byte[] encoding2 = string1.getBytes("ISO8859-1"); // ISO-8859-1 (ISO Latin 1) character encoding
            char[] hexaChars = Hex.encodeHex(encoding2);
            StringBuilder str = new StringBuilder();
            // Percent-encode each byte: two hex digits per byte, prefixed with '%'
            for (int i = 0; i < hexaChars.length; i = i + 2) {
                str.append("%");
                str.append(hexaChars[i]);
                str.append(hexaChars[i + 1]);
            }
            return str.toString();
        } catch (UnsupportedEncodingException e) {
            System.out.println(e);
        }
        return "";
    }
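
As a rough illustration of what the method above does, here is a minimal sketch that produces the same kind of percent-encoded output using only the standard library. It assumes, as the comment in the method says, that the platform default charset is UTF-8, in which case the first two steps just reproduce the input string and are skipped here. The class name Latin1PercentEncoder and the method name percentEncode are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class Latin1PercentEncoder {
    // Turns each ISO-8859-1 byte of the text into "%xx" (lowercase hex).
    static String percentEncode(String text) {
        if (text == null || text.isEmpty()) {
            return "";
        }
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        StringBuilder out = new StringBuilder(bytes.length * 3);
        for (byte b : bytes) {
            // Mask to 0..255 so bytes above 0x7f are not sign-extended.
            out.append(String.format("%%%02x", b & 0xff));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+00E4 U+00BB U+00A5 is how the UTF-8 bytes of U+4EE5 look
        // after being wrongly decoded as ISO-8859-1.
        System.out.println(percentEncode("\u00e4\u00bb\u00a5")); // prints %e4%bb%a5
    }
}
```

Because the garbled characters map one-to-one back to the original UTF-8 bytes, the result is a valid UTF-8 percent-encoded value.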

After digging more, it seems the controller is receiving a Latin-encoded string:


public class Main {
    public static void main(String[] args) throws Exception {
        byte[] encoding1 = "以ä¸ä¾¿æ¯æå³æ­¤é®é¢çææä¿¡æ¯".getBytes("ISO8859-1");

        for (byte b : encoding1) {
            System.out.printf("%x ",b);
        }  
    }
}

I am still not sure how it ends up as a Latin-encoded string. Any suggestions? I also checked my server.xml.
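
One possible explanation, sketched below under the assumption that the browser sent UTF-8 bytes and something on the server side decoded them as ISO-8859-1: every byte then becomes exactly one character, and encoding the garbled string as ISO-8859-1 again returns the original UTF-8 bytes unchanged. The class name MojibakeRoundTrip and its helper names are hypothetical, and the sample uses only the first two characters of the text from the question.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class MojibakeRoundTrip {
    // Simulates the suspected fault: UTF-8 bytes decoded as ISO-8859-1.
    static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Formats bytes as space-separated lowercase hex pairs.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x ", b & 0xff));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String original = "\u4ee5\u4e0b";
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        String garbled = misdecode(original);
        byte[] back = garbled.getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(toHex(utf8));               // prints e4 bb a5 e4 b8 8b
        System.out.println(toHex(back));               // prints e4 bb a5 e4 b8 8b
        System.out.println(Arrays.equals(utf8, back)); // prints true
    }
}
```

If the hex dump of the received value matches the UTF-8 bytes of the submitted text, that points to a decoding step using ISO-8859-1 somewhere between the browser and the controller.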




Answer 4:


Thanks to everyone for your responses. After more investigation, here are my observations.

I am rendering my page using Mason (Perl + HTML), not JSP, so I was not able to specify the encoding in the page to force the browser to submit a UTF-8 encoded string.

For now I am programmatically converting the string back to bytes with "ISO8859-1" (Latin) and decoding those bytes as UTF-8 to get the actual string for consumption.

Please let me know if there is a way to specify the encoding in Mason (Perl + HTML) so that the parameters are submitted as UTF-8 instead of the default encoding.


public class Main {
    public static void main(String[] args) throws Exception {
        byte[] encoding1 = "ä»¥ä¸‹ä¾¿æ˜¯æœ‰å…³æ­¤é—®é¢˜çš„æ‰€æœ‰ä¿¡æ ¯".getBytes("ISO8859-1");                
        String s = new String(encoding1, "UTF-8");
        System.out.println(s);
    }
}
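
The workaround above, combined with step 4 of the question (passing the value on through response.sendRedirect), might be sketched as follows. This assumes the value really was mis-decoded as ISO-8859-1. The class name SubjectRepair and the helpers repair and toQueryParam are hypothetical. The strict decoder is a design choice of this sketch, not part of the original workaround: it leaves the input untouched when the bytes are not valid UTF-8, so a value that was decoded correctly in the first place is less likely to be damaged.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class SubjectRepair {
    // Re-interprets a string that was wrongly decoded as ISO-8859-1 as UTF-8.
    static String repair(String value) {
        // Characters outside Latin-1 cannot come from this kind of mis-decoding.
        if (value == null || !StandardCharsets.ISO_8859_1.newEncoder().canEncode(value)) {
            return value;
        }
        byte[] raw = value.getBytes(StandardCharsets.ISO_8859_1);
        CharsetDecoder strictUtf8 = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return strictUtf8.decode(ByteBuffer.wrap(raw)).toString();
        } catch (CharacterCodingException e) {
            return value; // not valid UTF-8, so keep the input as it is
        }
    }

    // Percent-encodes a value as UTF-8 for use in a query string.
    static String toQueryParam(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String garbled = new String("\u4ee5\u4e0b".getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);
        String fixed = repair(garbled);
        System.out.println(fixed.equals("\u4ee5\u4e0b")); // prints true
        System.out.println(toQueryParam(fixed));          // prints %E4%BB%A5%E4%B8%8B
    }
}
```

In the controller, the encoded value could then be appended to the redirect target, for example response.sendRedirect(targetUrl + "?subject=" + SubjectRepair.toQueryParam(SubjectRepair.repair(subject))), where targetUrl is a placeholder.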


Source: https://stackoverflow.com/questions/18211612/chinese-character-encoding
