I'm trying to validate an XML document against a W3C XML Schema.
The following code does the job and reports when an error occurs, but I'm unable to get the line number of the error.
I found this
http://www.herongyang.com/XML-Schema/Xerces2-XSD-Validation-with-XMLReader.html
that appears to provide the following details (including line numbers):
Error:
Public ID: null
System ID: file:///D:/herong/dictionary_invalid_xsd.xml
Line number: 7
Column number: 22
Message: cvc-datatype-valid.1.2.1: 'yes' is not a valid 'boolean' value.
using this code:
/**
 * XMLReaderValidator.java
 * Copyright (c) 2002 by Dr. Herong Yang. All rights reserved.
 */
import java.io.IOException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class XMLReaderValidator {
    public static void main(String[] args) {
        // Requires Xerces on the classpath.
        String parserClass = "org.apache.xerces.parsers.SAXParser";
        String validationFeature
            = "http://xml.org/sax/features/validation";
        String schemaFeature
            = "http://apache.org/xml/features/validation/schema";
        try {
            String x = args[0];
            XMLReader r = XMLReaderFactory.createXMLReader(parserClass);
            r.setFeature(validationFeature, true);
            r.setFeature(schemaFeature, true);
            r.setErrorHandler(new MyErrorHandler());
            r.parse(x);
        } catch (SAXException e) {
            System.out.println(e.toString());
        } catch (IOException e) {
            System.out.println(e.toString());
        }
    }

    // Prints every reported problem together with its position in the document.
    private static class MyErrorHandler extends DefaultHandler {
        public void warning(SAXParseException e) throws SAXException {
            System.out.println("Warning: ");
            printInfo(e);
        }

        public void error(SAXParseException e) throws SAXException {
            System.out.println("Error: ");
            printInfo(e);
        }

        public void fatalError(SAXParseException e) throws SAXException {
            System.out.println("Fatal error: ");
            printInfo(e);
        }

        private void printInfo(SAXParseException e) {
            System.out.println(" Public ID: " + e.getPublicId());
            System.out.println(" System ID: " + e.getSystemId());
            System.out.println(" Line number: " + e.getLineNumber());
            System.out.println(" Column number: " + e.getColumnNumber());
            System.out.println(" Message: " + e.getMessage());
        }
    }
}
Try using a SAX Locator: http://download.oracle.com/javase/1.5.0/docs/api/org/xml/sax/Locator.html
Parsers are not required to supply one, but if they do, it should report line numbers.
I think your code should include:
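// Holds the locator handed over by the parser (stays null if none is supplied).
private Locator locator;

// This will be called when the XML parser starts reading
// XML data; here we save a reference to the current position in the XML:
public void setDocumentLocator(Locator locator) {
    this.locator = locator;
}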
(see http://www.java-tips.org/java-se-tips/org.xml.sax/using-xml-locator-to-indicate-current-parser-pos.html)
The parser will give you a locator, which you can then use to get the line number. It's probably worth printing or debugging when this method is called, to check that you actually receive a valid locator.
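To make the locator idea concrete, here is a minimal sketch of a handler that keeps the locator and asks it for the line number on every start tag. It assumes the JDK's bundled SAX parser, which does supply a locator; the names LocatorDemo, PositionHandler and positionsOf are made up for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

public class LocatorDemo {

    // Hypothetical handler: records "elementName@line" for every start tag.
    static class PositionHandler extends DefaultHandler {
        private Locator locator; // stays null if the parser supplies none
        final List<String> positions = new ArrayList<>();

        @Override
        public void setDocumentLocator(Locator locator) {
            // Called by the parser before any other document event.
            this.locator = locator;
        }

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            // Fall back to -1 when no locator was provided.
            int line = (locator != null) ? locator.getLineNumber() : -1;
            positions.add(qName + "@" + line);
        }
    }

    // Parses the given XML text and returns the recorded positions.
    static List<String> positionsOf(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        PositionHandler handler = new PositionHandler();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.positions;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root>\n  <item/>\n</root>";
        System.out.println(positionsOf(xml));
    }
}
```

The locator reports the position just after the text of the current event, so for a start tag you get the line on which the tag ends. A parser that supplies no locator would leave every entry at -1 here.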
Assuming the final objective is to have a validated DOM instance, the previous answers would require XML documents to be read twice: first for validation, and then again to build the object tree. That's fine if the document is given as a file path, but it would require some sort of workaround if it were provided as an input stream, which in principle can only be read once.
A more efficient alternative is to use a validating parser to check the XML document against the schema as the object tree is built. See the code below for how to set up a schema-validating DOM parser:
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class XML {
    public static Document load(String xml, String xsd) {
        // The default error handler just prints errors to the standard error output. In
        // order to make the parser interrupt its work once a validation error is found,
        // we need to use a custom handler that throws an exception in response to any
        // reported issues.
        ErrorHandler errorHandler = new ErrorHandler() {
            @Override
            public void error(SAXParseException exception) throws SAXException {
                throw exception;
            }

            @Override
            public void fatalError(SAXParseException exception) throws SAXException {
                throw exception;
            }

            @Override
            public void warning(SAXParseException exception) throws SAXException {
                throw exception;
            }
        };

        // try-with-resources closes the input stream even when parsing fails.
        try (InputStream input = new FileInputStream(xml)) {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new File(xsd));
            DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
            builderFactory.setNamespaceAware(true);
            builderFactory.setSchema(schema);
            DocumentBuilder builder = builderFactory.newDocumentBuilder();
            builder.setErrorHandler(errorHandler);
            return builder.parse(input);
        }
        catch (SAXParseException e) {
            // The exception carries the position of the offending content.
            int row = e.getLineNumber();
            int col = e.getColumnNumber();
            String message = e.getMessage();
            System.out.println("Validation error at line " + row + ", column " + col + ": \"" + message + '"');
        }
        catch (Exception e) {
            e.printStackTrace();
        }
        return null;
    }

    public static void main(String[] args) {
        String xml = args[0];
        String xsd = args[1];
        Document document = load(xml, xsd);
        boolean valid = (document != null);
        System.out.println("Document \"" + xml + "\" is " + (valid ? "" : "not ") + "valid against schema \"" + xsd + '"');
    }
}
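Validating a DOMSource loses position information, because a DOM tree does not record where its nodes came from in the file. Replace this line:
validator.validate(new DOMSource(document));
with
validator.validate(new StreamSource(new File("myxml.xml")));
so that the SAXParseException contains the line number and column number.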
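For illustration, a small self-contained sketch of the StreamSource approach. To keep it runnable without any files, it validates XML text from a StringReader against a one-element schema embedded in the code; the schema, the class name StreamValidation and the method firstErrorLine are assumptions made for this example only.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXParseException;

public class StreamValidation {

    // Illustrative schema: a single <flag> element holding an xs:boolean.
    static final String XSD =
          "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"flag\" type=\"xs:boolean\"/>"
        + "</xs:schema>";

    // Returns the line of the first validation error, or 0 if the XML is valid.
    static int firstErrorLine(String xml) throws Exception {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();
        try {
            // A stream source is parsed by the validator itself, so the
            // position of each error is known when the exception is created.
            validator.validate(new StreamSource(new StringReader(xml)));
            return 0;
        } catch (SAXParseException e) {
            return e.getLineNumber();
        }
    }

    public static void main(String[] args) throws Exception {
        String invalid = "<!-- sample -->\n<flag>yes</flag>";
        System.out.println("First error at line " + firstErrorLine(invalid));
    }
}
```

With no error handler set, a Validator throws on the first error, which is why a plain try/catch is enough here. The same catch block works unchanged with new StreamSource(new File(...)).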