Question
I am validating an in-memory DOM object against an XSD schema using the javax.xml.validation.Validator class. A SAXParseException is thrown during validation whenever the data I populate my DOM from is corrupted.
An example error:
org.xml.sax.SAXParseException: cvc-datatype-valid.1.2.1: '???"??[?????G?>???p~tn??~0?1]' is not a valid value for 'hexBinary'.
What I am hoping is that there is a way to find the location of this error in my in-memory DOM and print out the offending element and its parent element. My current code is:
public void writeDocumentToFile(Document document) throws XMLWriteException {
    try {
        // Validate the document against the schema
        Validator validator = getSchema(xmlSchema).newValidator();
        validator.validate(new DOMSource(document));
        // Serialisation logic here.
    } catch(SAXException e) {
        throw new XMLWriteException(e); // This is being thrown
    } // Some other exceptions caught here.
}

private Schema getSchema(URL schema) throws SAXException {
    SchemaFactory schemaFactory =
        SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
    // Some logic here to specify a ResourceResolver
    return schemaFactory.newSchema(schema);
}
I have looked into the Validator#setErrorHandler(ErrorHandler handler) method, but the ErrorHandler interface only gives me access to a SAXParseException, which only exposes the line number and column number of the error. Because I am validating an in-memory DOM, both the line and the column number come back as -1.
Is there a better way to do this? I don't really want to have to manually validate the Strings before I add them to the DOM if the libraries provide me the function I'm looking for.
I'm using JDK 6 update 26 and JDK 6 update 7 depending on where this code is running.
EDIT: With this code added:
validator.setErrorHandler(new ErrorHandler() {
    @Override
    public void warning(SAXParseException exception) throws SAXException {
        printException(exception);
        throw exception;
    }

    @Override
    public void error(SAXParseException exception) throws SAXException {
        printException(exception);
        throw exception;
    }

    @Override
    public void fatalError(SAXParseException exception) throws SAXException {
        printException(exception);
        throw exception;
    }

    private void printException(SAXParseException exception) {
        System.out.println("exception.getPublicId() = " + exception.getPublicId());
        System.out.println("exception.getSystemId() = " + exception.getSystemId());
        System.out.println("exception.getColumnNumber() = " + exception.getColumnNumber());
        System.out.println("exception.getLineNumber() = " + exception.getLineNumber());
    }
});
I get the output:
exception.getPublicId() = null
exception.getSystemId() = null
exception.getColumnNumber() = -1
exception.getLineNumber() = -1
Answer 1:
If you are using Xerces (the Sun JDK default), you can get the element that failed validation through the http://apache.org/xml/properties/dom/current-element-node property:
...
catch (SAXParseException e)
{
    Element curElement = (Element)validator.getProperty("http://apache.org/xml/properties/dom/current-element-node");
    System.out.println("Validation error: " + e.getMessage());
    System.out.println("Element: " + curElement);
}
Example:
String xml = "<root xmlns=\"http://www.myschema.org\">\n" +
             "<text>This is text</text>\n" +
             "<number>32</number>\n" +
             "<number>abc</number>\n" +
             "</root>";
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
Document doc = dbf.newDocumentBuilder().parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
Schema schema = getSchema(getClass().getResource("myschema.xsd"));
Validator validator = schema.newValidator();
try
{
    validator.validate(new DOMSource(doc));
}
catch (SAXParseException e)
{
    Element curElement = (Element)validator.getProperty("http://apache.org/xml/properties/dom/current-element-node");
    System.out.println("Validation error: " + e.getMessage());
    System.out.println(curElement.getLocalName() + ": " + curElement.getTextContent());
    // Use curElement.getParentNode() or whatever you need here
}
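The question also asks for the offending element together with its parent, and the example above stops at the first error. As a rough sketch only, the same Xerces property can be read from inside an ErrorHandler so that every error is recorded with its element and serialised parent. The class name DomValidationReport, the SAMPLE_XSD schema, and the validate and toXml helpers are hypothetical names made up for this illustration, and the approach assumes a Xerces-based JAXP implementation (other implementations may not recognise the property).

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper class; names and the sample schema are illustrative only.
class DomValidationReport {
    // Xerces-specific property, the same one used in the answer above.
    static final String CURRENT_ELEMENT =
            "http://apache.org/xml/properties/dom/current-element-node";

    // Made-up schema: a <root> holding one or more xs:int <number> elements.
    static final String SAMPLE_XSD =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
          + "<xs:element name='root'><xs:complexType><xs:sequence>"
          + "<xs:element name='number' type='xs:int' maxOccurs='unbounded'/>"
          + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    /** Parses xml into a DOM, validates it against xsd, returns one line per error. */
    static List<String> validate(String xsd, String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)));
        final Validator validator = schema.newValidator();
        final List<String> report = new ArrayList<String>();
        validator.setErrorHandler(new ErrorHandler() {
            // Not rethrowing from warning/error lets validation continue.
            @Override public void warning(SAXParseException e) { record(e); }
            @Override public void error(SAXParseException e) { record(e); }
            @Override public void fatalError(SAXParseException e) throws SAXException {
                record(e);
                throw e;
            }
            private void record(SAXParseException e) {
                Node node;
                try {
                    node = (Node) validator.getProperty(CURRENT_ELEMENT);
                } catch (SAXException unsupported) {
                    node = null; // property not available on this implementation
                }
                if (node == null) {
                    report.add("(unknown element): " + e.getMessage());
                    return;
                }
                Node parent = node.getParentNode();
                String where = (parent == null) ? "(no parent)" : toXml(parent);
                report.add(node.getNodeName() + " in " + where + ": " + e.getMessage());
            }
        });
        validator.validate(new DOMSource(doc));
        return report;
    }

    /** Serialises a node to a string; falls back to the node name on failure. */
    static String toXml(Node node) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(node), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            return node.getNodeName();
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><number>32</number><number>abc</number></root>";
        for (String line : validate(SAMPLE_XSD, xml)) {
            System.out.println(line);
        }
    }
}
```

A single bad value may produce more than one report line, because the validator can raise several errors for the same element. Collecting them instead of rethrowing is a design choice; if the first error should abort the write, rethrow from error() as in the question's handler.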
If you need to get line/column numbers from the DOM, this answer has a solution to that problem.
Answer 2:
SAXParseException exposes the SystemId and PublicId. Does that not give you enough information?
Source: https://stackoverflow.com/questions/8077437/how-can-i-get-more-information-on-an-invalid-dom-element-through-the-validator