Question
Below is my Groovy code to validate an XML file against an XSD:
import java.io.File;
import java.io.IOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import javax.xml.transform.sax.SAXSource
import javax.xml.parsers.SAXParserFactory
import org.xml.sax.SAXException
import org.xml.sax.InputSource
import org.xml.sax.SAXParseException
import org.xml.sax.ErrorHandler
def validateXMLSchema(String xsdPath, String xmlPath) {
    final List<SAXParseException> exceptions = new LinkedList<SAXParseException>();
    try {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new File(xsdPath));
        Validator validator = schema.newValidator();
        // Collect every problem instead of stopping at the first one
        validator.setErrorHandler(new ErrorHandler() {
            @Override
            public void warning(SAXParseException exception) throws SAXException {
                exceptions.add(exception);
            }
            @Override
            public void fatalError(SAXParseException exception) throws SAXException {
                exceptions.add(exception);
            }
            @Override
            public void error(SAXParseException exception) throws SAXException {
                exceptions.add(exception);
            }
        });
        def xmlFile = new File(xmlPath);
        validator.validate(new StreamSource(xmlFile));
        exceptions.each {
            println 'lineNumber : ' + it.lineNumber + '; message : ' + it.message
        }
    } catch (SAXParseException e) {
        // Only SAXParseException carries a line number
        println("Exception: line ${e.lineNumber} " + e.getMessage());
        return false;
    } catch (IOException | SAXException e) {
        println("Exception: " + e.getMessage());
        return false;
    }
    return exceptions.size() == 0;
}
Below are some of the validation errors. I can access the line number for each message, and I am trying to find the corresponding node name:
lineNumber : 106; message : cvc-datatype-valid.1.2.1: '' is not a valid value for 'date'.
lineNumber : 248; message : cvc-enumeration-valid: Value 'Associate' is not facet-valid with respect to enumeration '[ADJSTR, ADJSMT]
Is there a simple way to find the node name for the corresponding error message using the line number? Or do I have to read that specific line and parse it using XmlSlurper, as below? I am trying to avoid this approach, since it will be slower for larger XML files in production under heavy user load.
def getNodeName(xmlFile, lineNumber) {
    // SAX line numbers are 1-based, while list indices are 0-based
    def xmlLine = xmlFile.readLines().get(lineNumber - 1)
    def node = new XmlSlurper().parseText(xmlLine.toString())
    node.name()
}
Answer 1:
This is not elegant, but the following getNodeName() should be faster:
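def getNodeName(xmlFile, lineNumber) {
    def result = "unknown"
    def count = 1
    // Captures everything between the first '<' and the next '>',
    // so any attributes on the tag are included in the match
    def NODE_REGEX = /.*?<(.*?)>.*/
    def br
    try {
        br = new BufferedReader(new FileReader(xmlFile))
        String line
        def isDone = false
        // Stop reading as soon as the requested line has been handled
        while ((! isDone) && (line = br.readLine()) != null) {
            if (count == lineNumber) {
                def matcher = (line =~ NODE_REGEX)
                if (matcher.matches()) {
                    result = matcher[0][1]
                }
                isDone = true
            }
            count++
        }
    } finally {
        // TODO: better exception handling
        // Null-safe close: br is still null if the file could not be opened
        br?.close()
    }
    return result
}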
It simply reads lines until it reaches the line in question, then uses a rudimentary regular expression to extract the name. You could use XmlSlurper on that line instead, as in your example, if preferred. The key point is that file I/O and memory use should be considerably lower, because reading stops at the target line instead of loading the whole file.
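The line-scanning approach above opens the file again for every reported error. As an alternative sketch (not part of the original answer), the element name could be captured during validation itself, in a single pass, by placing a SAX filter between the parser and a ValidatorHandler from javax.xml.validation. The names OpenElementTracker and validateWithNodeNames are hypothetical, and the sketch assumes that the validator reports each error while the offending element is still open, which I would expect for attribute and simple-content errors but have not checked for every error type.

```groovy
import javax.xml.XMLConstants
import javax.xml.parsers.SAXParserFactory
import javax.xml.transform.stream.StreamSource
import javax.xml.validation.SchemaFactory
import org.xml.sax.Attributes
import org.xml.sax.ErrorHandler
import org.xml.sax.InputSource
import org.xml.sax.SAXParseException
import org.xml.sax.helpers.XMLFilterImpl

// Hypothetical helper: remembers which elements are currently open.
class OpenElementTracker extends XMLFilterImpl {
    final Deque<String> openElements = new ArrayDeque<String>()

    @Override
    void startElement(String uri, String localName, String qName, Attributes atts) {
        openElements.push(qName)                         // record before the validator sees the event
        super.startElement(uri, localName, qName, atts)
    }

    @Override
    void endElement(String uri, String localName, String qName) {
        super.endElement(uri, localName, qName)          // forward first, so the element is still on the stack
        openElements.pop()
    }
}

// Hypothetical function: returns one map (line, element, message) per reported problem.
List<Map> validateWithNodeNames(Reader xsd, Reader xml) {
    def findings = []
    def schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
    def schema = schemaFactory.newSchema(new StreamSource(xsd))

    def parserFactory = SAXParserFactory.newInstance()
    parserFactory.setNamespaceAware(true)                // schema validation needs namespace-aware events

    def tracker = new OpenElementTracker()
    tracker.setParent(parserFactory.newSAXParser().getXMLReader())

    // Every severity is recorded together with the innermost open element.
    def record = { SAXParseException e ->
        findings << [line: e.lineNumber, element: tracker.openElements.peek(), message: e.message]
    }
    def validatorHandler = schema.newValidatorHandler()
    validatorHandler.setErrorHandler([warning: record, error: record, fatalError: record] as ErrorHandler)

    // Event flow: parser -> tracker -> validatorHandler
    tracker.setContentHandler(validatorHandler)
    tracker.parse(new InputSource(xml))
    return findings
}
```

A call such as validateWithNodeNames(new FileReader(xsdPath), new FileReader(xmlPath)) would then return the line number, element name, and message for each problem without a second read of the XML file. The stack stores qualified names, so prefixed elements appear with their prefix.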
Source: https://stackoverflow.com/questions/47701357/java-groovy-find-xml-node-by-line-number