Java / Groovy : Find XML node by Line number

Submitted by 百般思念 on 2019-12-25 00:35:01

Question


Below is my Groovy code to validate an XML file against an XSD:

import java.io.File;
import java.io.IOException;

import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import javax.xml.transform.sax.SAXSource
import javax.xml.parsers.SAXParserFactory
import org.xml.sax.SAXException 
import org.xml.sax.InputSource
import org.xml.sax.SAXParseException
import org.xml.sax.ErrorHandler


def validateXMLSchema(String xsdPath, String xmlPath) {
    final List < SAXParseException > exceptions = new LinkedList < SAXParseException > ();
     try {
      SchemaFactory factory =
       SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
      Schema schema = factory.newSchema(new File(xsdPath));
      Validator validator = schema.newValidator();  
      validator.setErrorHandler(new ErrorHandler() {
       @Override
       public void warning(SAXParseException exception) throws SAXException {
        exceptions.add(exception);
       }

       @Override
       public void fatalError(SAXParseException exception) throws SAXException {
        exceptions.add(exception);
       }

       @Override
       public void error(SAXParseException exception) throws SAXException {
        exceptions.add(exception);
       }
      });
      def xmlFile = new File(xmlPath);
      validator.validate(new StreamSource(xmlFile));
      exceptions.each {
       println 'lineNumber : ' + it.lineNumber + '; message : ' + it.message
      }
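     } catch (SAXParseException e) {
      // Only SAXParseException carries a line number
      println("Exception: line ${e.lineNumber} " + e.getMessage());
      return false;
     } catch (IOException | SAXException e) {
      // IOException and plain SAXException have no lineNumber property
      println("Exception: " + e.getMessage());
      return false;
     }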
     return exceptions.size() == 0;
}

Below are some of the validation errors. I can access the line number for each message, and I am trying to find the corresponding node name:

lineNumber : 106; message : cvc-datatype-valid.1.2.1: '' is not a valid value for 'date'. 
lineNumber : 248; message : cvc-enumeration-valid: Value 'Associate' is not facet-valid with respect to enumeration '[ADJSTR, ADJSMT]

Is there a simple way to find the node name for the corresponding error message using the line number? Or do I have to read that specific line and parse it with XmlSlurper, as below? I am trying to avoid this approach because it will be slower for larger XML files in production under heavy user load.

def getNodeName(xmlFile, lineNumber){
  // lineNumber from SAXParseException is 1-based; the list index is 0-based
  def xmlLine = xmlFile.readLines().get(lineNumber - 1)
  def node = new XmlSlurper().parseText(xmlLine.toString())
  node.name()
}

Answer 1:


This is not elegant, but the following getNodeName() should be faster:

def getNodeName(xmlFile, lineNumber) {
    def result = "unknown"
    def count = 1
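    // Captures everything between the first '<' and the next '>' on the line,
    // so the result can include attributes or a leading '/'
    def NODE_REGEX = /.*?<(.*?)>.*/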
    def br 

    try {
        br = new BufferedReader(new FileReader(xmlFile)) 
        String line
        def isDone = false
        while ((! isDone) && (line = br.readLine()) != null) {
            if (count == lineNumber) {
                def matcher = (line =~ NODE_REGEX) 
                if (matcher.matches()) {
                    result = matcher[0][1]
                }
                isDone = true
            }
            count++
        }
    } finally {
        // TODO: better exception handling
        // Null-safe close: br is still null if the file could not be opened
        br?.close()
    }

    return result
}

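It simply reads lines until it reaches the line in question and then uses a rudimentary regular expression to extract the name. You could use XmlSlurper on that line instead, as in your example, if you prefer. The key point is that file I/O and memory use should be considerably lower, because reading stops at the target line instead of loading the whole file with readLines().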
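
Since the validator usually reports several errors, the same idea can be applied to all line numbers in one pass over the file. The following is a minimal Java sketch of that variation, not the code from the answer: the class name NodeNameLookup, the method findNodeNames, and the tighter regular expression are assumptions made for illustration. It assumes a UTF-8 file and ASCII element names, and it still only inspects the first start tag on each reported line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original answer.
final class NodeNameLookup {

    // Captures only the name of the first start tag on a line.
    // Closing tags, comments, and processing instructions do not match
    // because the character after '<' must be a letter or underscore.
    private static final Pattern START_TAG =
            Pattern.compile("<([A-Za-z_][^\\s/>]*)");

    // Returns a map from each requested 1-based line number to the element
    // name found on that line, or "unknown" if the line has no start tag.
    // Line numbers beyond the end of the file are absent from the result.
    static Map<Integer, String> findNodeNames(Path xmlFile, Set<Integer> lineNumbers)
            throws IOException {
        Map<Integer, String> names = new TreeMap<>();
        if (lineNumbers.isEmpty()) {
            return names;
        }
        // Stop reading once the highest requested line has been processed.
        int lastWanted = Collections.max(lineNumbers);
        try (BufferedReader reader =
                     Files.newBufferedReader(xmlFile, StandardCharsets.UTF_8)) {
            String line;
            int current = 1;
            while (current <= lastWanted && (line = reader.readLine()) != null) {
                if (lineNumbers.contains(current)) {
                    Matcher m = START_TAG.matcher(line);
                    names.put(current, m.find() ? m.group(1) : "unknown");
                }
                current++;
            }
        }
        return names;
    }
}
```

From the Groovy validation code, this could be called with the collected line numbers, for example NodeNameLookup.findNodeNames(xmlFile.toPath(), exceptions.collect { it.lineNumber } as Set). Like the answer's version, it relies on the reported line actually containing the element's start tag; if an element spans several lines, the parser may report a line that holds only the closing tag, in which case the result is "unknown".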



Source: https://stackoverflow.com/questions/47701357/java-groovy-find-xml-node-by-line-number
