What does the message “Invalid byte 2 of a 3-byte UTF-8 sequence” mean?

半阙折子戏 2021-01-04 02:41

I changed a file in Orbeon Forms, and the next time I loaded the page, I got an error message saying "Invalid byte 2 of a 3-byte UTF-8 sequence". How can I solve this problem?

9 answers
  • 2021-01-04 03:00

    I had the same problem.

    Problem > I'm reading X.509 certificate values (from sources with different encodings) to generate a PDF report. The PDF is generated through a web service that expects a UTF-8 XML request, so I have to re-encode the values before marshalling.

    Solution > http://fabioangelini.wordpress.com/2011/08/04/converting-java-string-fromto-utf-8/

    Using this class:

    import java.nio.charset.StandardCharsets;

    public class StringHelper {

        // Repairs a String in which every char holds one raw UTF-8 byte
        // (UTF-8 data that was wrongly decoded as ISO-8859-1).
        public static String convertFromUTF8(String s) {
            return new String(s.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
        }

        // Inverse: encodes the String as UTF-8 and stores each byte as one char,
        // so that writing the result as ISO-8859-1 yields UTF-8 bytes.
        public static String convertToUTF8(String s) {
            return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
        }
    }
    

    Usage:

    // getSummaryAttMap() returns a HashMap
    String value = (String) getSummaryAttMap().get(key);
    value = (value != null) ? StringHelper.convertToUTF8(value) : "";
    
  • 2021-01-04 03:01

    When you start your program, use the following Java command line argument:

    -Dfile.encoding=UTF-8
    

    For example,

    java -Dfile.encoding=UTF-8 -jar foo.jar
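
To check whether the flag actually took effect, you can print what the JVM reports. This is only a small sketch; the class name `DefaultCharsetCheck` and the method `describeDefaults` are made up for the example.

```java
import java.nio.charset.Charset;

public class DefaultCharsetCheck {

    // Reports the file.encoding system property and the charset the JVM
    // uses when no charset is given explicitly.
    public static String describeDefaults() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(describeDefaults());
    }
}
```

Run it once with and once without `-Dfile.encoding=UTF-8` and compare the two lines of output.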
    
  • 2021-01-04 03:01

    Here is an answer from the coding side. Suppose you have checked the XML file and found nothing wrong with it, and you are using Java on a Tomcat server. If your source code doesn't specify an encoding itself, the JVM uses its default encoding when it reads the XML content into a string (or into anything else that represents a string), and that default in turn comes from Tomcat's default encoding. If the encoding of the XML file and Tomcat's default encoding are inconsistent, you may get the same error message.
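
A minimal sketch of how to avoid depending on the default, assuming the XML is available as raw bytes: hand the bytes to the parser so the XML declaration decides the encoding, and name the charset explicitly whenever you convert between strings and bytes yourself. The class name `ExplicitEncodingExample` and the method `rootText` are invented for the illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class ExplicitEncodingExample {

    // Parses raw bytes, so the parser (not the JVM default charset)
    // decides how to decode them, based on the XML declaration.
    public static String rootText(byte[] xmlBytes) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlBytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><name>Jos\u00e9</name>";
        // Explicit charset; xml.getBytes() without an argument would use the JVM default.
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        System.out.println(rootText(bytes));
    }
}
```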

  • 2021-01-04 03:03

    You might need to configure your Tomcat with the following parameter:

    -Dfile.encoding=UTF-8

  • 2021-01-04 03:06

    This happens when Orbeon Forms reads an XML file and expects it to use the UTF-8 encoding, but somehow the file isn't properly encoded in UTF-8. To solve this, make sure that:

    1. You have an XML declaration at the beginning of the file saying the file is in UTF-8:

      <?xml version="1.0" encoding="UTF-8" ?>
      
    2. Your editor is XML-aware, so it can parse the XML declaration and consequently use the UTF-8 encoding. If your editor isn't XML-aware and you don't want to switch editors, look for an option or preference that lets you specify that the editor must save files as UTF-8.
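
To see why a file that isn't really UTF-8 produces this particular message: in ISO-8859-1, "é" is the single byte 0xE9, but in UTF-8 any byte from 0xE0 to 0xEF announces a 3-byte sequence whose next two bytes must lie in 0x80 to 0xBF. If the byte after 0xE9 is plain ASCII, byte 2 of the supposed sequence is invalid. The sketch below checks a byte array for well-formed UTF-8 using a strict decoder; the class name `Utf8Check` and the method `isValidUtf8` are made up for the example, and it is not how Orbeon Forms itself validates input.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class Utf8Check {

    // Returns true only if the bytes are well-formed UTF-8.
    public static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><name>Jos\u00e9</name>";

        // Saved correctly: the declaration and the actual bytes agree.
        byte[] savedAsUtf8 = xml.getBytes(StandardCharsets.UTF_8);
        // Saved by an editor using ISO-8859-1: 0xE9 is followed by '<'.
        byte[] savedAsLatin1 = xml.getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(isValidUtf8(savedAsUtf8));   // true
        System.out.println(isValidUtf8(savedAsLatin1)); // false
    }
}
```

In practice you would read the bytes of the file you edited, for example with `Files.readAllBytes`, and pass them to the check.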

  • 2021-01-04 03:07

    Switching the encoding used for the input might help:

    // Pass the encoding the input really uses, e.g. "utf-8" or "windows-1251"
    XMLEventReader eventReader =
            inputFactory.createXMLEventReader(in, "utf-8");
    