“Content is not allowed in prolog” when parsing perfectly valid XML on GAE

抹茶落季 2020-11-27 15:20

I've been beating my head against this absolutely infuriating bug for the last 48 hours, so I thought I'd finally throw in the towel and try asking here before I throw my

13 Answers
  • 2020-11-27 16:13

    Unexpected reason: # character in file path

    Due to some internal bug, the error "Content is not allowed in prolog" also appears when the file content itself is 100% correct but the file name you supply contains a # character, for example C:\Data\#22\file.xml.

    This may possibly apply to other special characters, too.

    How to check: If you move your file into a path without special characters and the error disappears, then it was this issue.
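
If moving the file is not an option, one possible workaround is to open the file yourself and hand the parser a stream instead of a path string. The sketch below assumes the failure comes from the path being interpreted as a URI, where # would start a fragment; the answer above does not confirm that cause. The class name `HashPathXml` is hypothetical.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper, not part of any library.
final class HashPathXml {
    // Opens the file through java.nio and passes the parser an InputStream,
    // so the parser never sees the path text (assumption: the '#' in the
    // path is what confuses it when the path is given as a string).
    static Document parse(Path xmlFile) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try (InputStream in = Files.newInputStream(xmlFile)) {
            return builder.parse(in);
        }
    }
}
```

Usage would look like `HashPathXml.parse(Paths.get("C:\\Data\\#22\\file.xml"))`. Note that relative references inside the document (for example a DTD path) have no base URI when parsing from a bare stream.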

  • 2020-11-27 16:14

    Removing the XML declaration solved it:

    <?xml version='1.0' encoding='utf-8'?>
    
  • 2020-11-27 16:15

    Below are possible causes of the "org.xml.sax.SAXParseException: Content is not allowed in prolog" exception.

    1. First check the file paths of schema.xsd and file.xml.
    2. The encoding in your XML and XSD (or DTD) should be the same.
      XML file header: <?xml version='1.0' encoding='utf-8'?>
      XSD file header: <?xml version='1.0' encoding='utf-8'?>
    3. Check whether anything comes before the XML declaration, e.g. hello<?xml version='1.0' encoding='utf-16'?>
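
Cause 3 is easy to reproduce in isolation. The following is a minimal sketch, assuming the JDK's built-in DOM parser; the class and method names are hypothetical. It reports whether a given text is rejected with a SAXParseException.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical helper for experimenting with prolog errors.
final class PrologCheck {
    // Returns true if the parser rejects the text with a SAXParseException.
    static boolean failsToParse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            builder.parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (SAXParseException e) {
            return true;
        }
    }
}
```

With this helper, `hello<?xml version='1.0' encoding='utf-8'?><root/>` should be rejected, while the same text without the leading `hello` should parse.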
  • 2020-11-27 16:16

    I had a tab character instead of spaces. Replacing the tab '\t' fixed the problem.

    Cut and paste the whole doc into an editor like Notepad++ and display all characters.
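
If no such editor is at hand, the same check can be done in code by dumping the first few bytes of the document as hex. This is only a sketch; `PrologPeek` and `hexPrefix` are hypothetical names.

```java
// Hypothetical helper for inspecting what really precedes the first '<'.
final class PrologPeek {
    // Renders the first `limit` bytes as space-separated uppercase hex,
    // so hidden characters (tab = 09, UTF-8 BOM = EF BB BF) become visible.
    static String hexPrefix(byte[] data, int limit) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < Math.min(limit, data.length); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", data[i] & 0xFF));
        }
        return sb.toString();
    }
}
```

A clean document starts with `3C 3F 78 6D 6C` (the bytes of `<?xml` in UTF-8); anything before the `3C` is a candidate for the error.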

  • 2020-11-27 16:17

    The encodings declared in your XML and XSD (or DTD) are different:
    XML file header: <?xml version='1.0' encoding='utf-8'?>
    XSD file header: <?xml version='1.0' encoding='utf-16'?>

    Another possible cause is anything that comes before the XML declaration, i.e. you might have something like this in the buffer:

    helloworld<?xml version="1.0" encoding="utf-8"?>  
    

    or even a space or special character.

    Special characters called byte order marks (BOMs) may also be in the buffer. Before passing the buffer to the parser, strip them like this:

    String xml = "<?xml ...";
    xml = xml.trim().replaceFirst("^([\\W]+)<","<");
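
The regex above removes every leading non-word character up to the first `<`. A narrower alternative, sketched here under the assumption that the only stray character is a decoded BOM (U+FEFF), removes just that one character. `String.trim()` alone does not help, because it only strips characters up to U+0020. The class name `BomStripper` is hypothetical.

```java
// Hypothetical helper, a narrower alternative to the regex approach.
final class BomStripper {
    // Removes a single leading U+FEFF (the byte order mark after decoding),
    // if present; every other character is left untouched.
    static String stripBom(String xml) {
        if (xml != null && !xml.isEmpty() && xml.charAt(0) == '\uFEFF') {
            return xml.substring(1);
        }
        return xml;
    }
}
```

This leaves other leading junk such as `helloworld` in place, so the parser still reports it instead of the problem being silently hidden.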
    
  • 2020-11-27 16:17

    In my xml file, the header looked like this:

    <?xml version="1.0" encoding="utf-16"?>
    

    In a test file, I was reading the file bytes and decoding the data as UTF-8 (not realizing the header in this file was utf-16) to create a string.

    // Decodes as UTF-8 even though the file's header declares utf-16
    byte[] data = Files.readAllBytes(Paths.get(path));
    String dataString = new String(data, "UTF-8");
    

    When I tried to deserialize this string into an object, I was seeing the same error:

    javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,1]
    Message: Content is not allowed in prolog.
    

    When I updated the second line to

    String dataString = new String(data, "UTF-16");
    

    I was able to deserialize the object just fine. So as Romain had noted above, the encodings need to match.
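
Another way to avoid the mismatch is to skip the manual decoding step and give the parser the raw bytes, so it can detect the encoding from the BOM and the declaration itself. The sketch below uses the JDK's DOM parser rather than the javax.xml.stream API from the stack trace above, which is an assumption about what fits your code; `ByteLevelParse` is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper: parse from bytes so no charset has to be guessed.
final class ByteLevelParse {
    // The parser reads the bytes directly and works out the encoding
    // from the byte order mark and the XML declaration.
    static Document parse(byte[] data) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(data));
    }
}
```

With the bytes from `Files.readAllBytes(Paths.get(path))`, this would be called as `ByteLevelParse.parse(data)`, with no `new String(...)` in between.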
