Java convert Windows-1252 to UTF-8, some letters are wrong

那年仲夏 submitted on 2019-11-29 08:47:38

Obviously, textoFormado is a variable of type String. This means that the bytes have already been decoded; internally, Java represents strings as UTF-16. What you did was encode your string as Windows-1252 and then read the resulting bytes as UTF-8. That does not work, because those bytes are Windows-1252 data, not UTF-8 data.
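To illustrate why this fails, here is a minimal sketch of the mistake described above. The class name, the helper encodeThenMisread and the sample text are made up for illustration and are not from the original question.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class WrongConversionDemo {
    // Hypothetical helper: encode as Windows-1252, then wrongly decode as UTF-8
    static String encodeThenMisread(String text) {
        byte[] cp1252Bytes = text.getBytes(Charset.forName("windows-1252"));
        return new String(cp1252Bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "a\u00F1o"; // "año", written with an escape to avoid source-encoding issues
        String broken = encodeThenMisread(original);
        // In Windows-1252 the letter ñ is the single byte 0xF1, which is not
        // a valid UTF-8 sequence on its own, so the decoded text differs.
        System.out.println(original.equals(broken));
    }
}
```

Plain ASCII letters survive this round trip because both encodings agree on them; only the accented letters break, which matches the "some letters are wrong" symptom.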

What you need is the correct encoding when reading the bytes:

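// getRawBytes() stands for however you obtain the original, undecoded bytes
byte[] sourceBytes = getRawBytes();
// Decode using the encoding the bytes were actually written in
String data = new String(sourceBytes, "Windows-1252");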

To use this string inside your program, you do not need to do anything; simply use it. If, however, you want to write the data back to a file, for example, you need to encode it again:

byte[] destinationBytes = data.getBytes("UTF-8");
// write bytes to destination file here
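Putting the two steps together, the following is a small self-contained sketch of the decode-then-encode conversion. The class and method names are assumptions for illustration, and the sample bytes stand in for whatever getRawBytes() would return.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class TranscodeDemo {
    // Hypothetical helper: Windows-1252 bytes in, UTF-8 bytes out
    static byte[] cp1252ToUtf8(byte[] sourceBytes) {
        // Step 1: decode with the encoding the bytes were written in
        String data = new String(sourceBytes, Charset.forName("windows-1252"));
        // Step 2: encode the decoded text in the target encoding
        return data.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "año" in Windows-1252: ñ is the single byte 0xF1
        byte[] source = {0x61, (byte) 0xF1, 0x6F};
        byte[] utf8 = cp1252ToUtf8(source);
        // In UTF-8, ñ takes two bytes, so the output is one byte longer
        System.out.println(source.length + " -> " + utf8.length);
    }
}
```

Passing Charset objects instead of charset names avoids the checked UnsupportedEncodingException that the String-based overloads declare.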

I solved it, thanks to all.

I have the following project structure:

  • MyBatisQueries: a query with a "select" that returns the String
  • A Pojo that stores the String (this is where the String arrived with conversion problems)
  • The class that uses the query and the Pojo object with the data (this is where the badly decoded text showed up)

At first I had this (MyBatis and Spring inject the dependencies and parameters):

public class Pojo {
    private String params;

    // MyBatis passes the column value in as an already-decoded String
    public void setParams(String params) {
        this.params = params;
    }
}

The solution:

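import java.io.UnsupportedEncodingException;

public class Pojo {
    private String params;

    // Receive the raw bytes and decode them explicitly as UTF-8
    public void setParams(byte[] params) {
        try {
            this.params = new String(params, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Not expected in practice: every Java platform must support UTF-8
            this.params = null;
        }
    }
}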
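As a possible variant of the setter above, the sketch below uses StandardCharsets.UTF_8, which removes the need for the try/catch. The class name ParamsHolder, the getter and the null check are assumptions added for illustration; like the solution above, it assumes the bytes delivered by the query are UTF-8.

```java
import java.nio.charset.StandardCharsets;

class ParamsHolder {
    private String params;

    // Decode the raw bytes explicitly; this overload declares no checked exception
    public void setParams(byte[] params) {
        // Assumption: a null column value should stay null rather than throw
        this.params = (params == null) ? null : new String(params, StandardCharsets.UTF_8);
    }

    public String getParams() {
        return params;
    }

    public static void main(String[] args) {
        ParamsHolder holder = new ParamsHolder();
        // "año" encoded as UTF-8: ñ is the two bytes 0xC3 0xB1
        holder.setParams(new byte[] {0x61, (byte) 0xC3, (byte) 0xB1, 0x6F});
        System.out.println("a\u00F1o".equals(holder.getParams()));
    }
}
```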