How to use Saxon's built-in catalog feature

你的背包 2021-01-14 00:18

I downloaded SaxonHE9-4-0-6J and want to process XHTML on the command line. However, Saxon tries to load the DTD from the W3C, which takes far too long for every simple command.

I have x

2 answers
  •  太阳男子
    2021-01-14 00:56

    From the Saxonica link in your question:

    When the -catalog option is used on the command line, this overrides the internal resolver used in Saxon (from 9.4) to redirect well-known W3C references (such as the XHTML DTD) to Saxon's local copies of these resources. Because both these features rely on setting the XML parser's EntityResolver, it is not possible to use them in conjunction.

    This sounds to me like Saxon automatically uses local copies of the well-known W3C DTDs, but if you specify -catalog, it does not use the internal resolver and you have to specify these explicitly in your catalog.
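
    For XHTML specifically, that would mean adding entries for the XHTML DTDs to your own catalog. A minimal sketch, assuming the OASIS XML Catalogs format and a local copy of the DTD at the hypothetical path dtd/xhtml1-strict.dtd next to the catalog file:

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog" prefer="public">
        <!-- Relative uri values resolve against this catalog file's location. -->
        <!-- dtd/xhtml1-strict.dtd is a hypothetical local copy, not a path Saxon provides. -->
        <public publicId="-//W3C//DTD XHTML 1.0 Strict//EN"
                uri="dtd/xhtml1-strict.dtd"/>
        <system systemId="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
                uri="dtd/xhtml1-strict.dtd"/>
    </catalog>
    ```

    The XHTML DTD pulls in its entity sets (xhtml-lat1.ent, xhtml-symbol.ent, xhtml-special.ent) by relative system identifier, so local copies of those would need to sit alongside the DTD.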


    Here's a working example of using a catalog with Saxon...

    File/directory structure of my example

    C:/so_test/lib
    C:/so_test/lib/catalog.xml
    C:/so_test/lib/resolver.jar
    C:/so_test/lib/saxon9he.jar
    C:/so_test/lib/test.dtd
    C:/so_test/test.xml
    

    XML DTD (so_test/lib/test.dtd)
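
    A minimal DTD consistent with the rest of this example could look like the following. The content model for test is taken from the parser error quoted further down; the declaration for foo is an assumption.

    ```xml
    <!-- test must contain exactly one foo, matching the "(foo)" model in the error output -->
    <!-- foo is assumed to hold plain text, since the query returns "Success!" -->
    <!ELEMENT test (foo)>
    <!ELEMENT foo (#PCDATA)>
    ```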

    
    
    

    XML Instance (so_test/test.xml)

    Note that the system identifier points to a location that doesn't exist to make sure the catalog is being used.
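
    A sketch of such an instance, using the public identifier that shows up in the -t output further down. The system identifier here is a made-up placeholder, not the one from the original file:

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <!-- The system identifier below is a hypothetical, non-existent location; -->
    <!-- only the public identifier is expected to be matched by the catalog. -->
    <!DOCTYPE test PUBLIC "-//TEST//Dan Test//EN" "no_such_dir/test.dtd">
    
    ```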

    
    
        Success!
    
    

    XML Catalog (so_test/lib/catalog.xml)
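
    A minimal catalog that would produce the "Resolved public" line in the -t output further down, assuming the OASIS XML Catalogs format and a uri given relative to the catalog file (catalog.xml and test.dtd both live in so_test/lib):

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog" prefer="public">
        <!-- prefer="public" lets the public identifier win even though the
             DOCTYPE also carries a (non-existent) system identifier. -->
        <!-- uri is relative to this catalog file, so it points at lib/test.dtd. -->
        <public publicId="-//TEST//Dan Test//EN" uri="test.dtd"/>
    </catalog>
    ```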

    
        
            
        
    
    

    Command Line

    Note the -dtd option to enable validation.

    C:\so_test>java -cp lib/saxon9he.jar;lib/resolver.jar net.sf.saxon.Query -s:"test.xml" -qs:"{data(/test/foo)}" -catalog:"lib/catalog.xml" -dtd
    

    Results

    Success!
    

    If I make the XML instance invalid:
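
    The errors that follow complain about an undeclared element x inside test, so the invalid variant presumably looked something like this sketch. The system identifier is a hypothetical placeholder, and the exact line positions from the error output are not reproduced:

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE test PUBLIC "-//TEST//Dan Test//EN" "no_such_dir/test.dtd">
    
    ```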

    
    
        
        Success!
    
    

    and run the same command line as above, here is the result:

    Recoverable error on line 4 column 6 of test.xml:
      SXXP0003: Error reported by XML parser: Element type "x" must be declared.
    Recoverable error on line 6 column 8 of test.xml:
      SXXP0003: Error reported by XML parser: The content of element type "test" must match "(foo)".
    Query processing failed: The XML parser reported two validation errors
    

    Hopefully this example will help you figure out what to change with your setup.

    Also, using the -t option gives you additional information such as what catalog was loaded and if the public identifier was resolved:

    Loading catalog: file:///C:/so_test/lib/catalog.xml
    Saxon-HE 9.4.0.6J from Saxonica
    Java version 1.6.0_35
    Analyzing query from {{data(/test/foo)}}
    Analysis time: 122.70132 milliseconds
    Processing file:/C:/so_test/test.xml
    Using parser org.apache.xml.resolver.tools.ResolvingXMLReader
    Building tree for file:/C:/so_test/test.xml using class net.sf.saxon.tree.tiny.TinyBuilder
    Resolved public: -//TEST//Dan Test//EN
            file:/C:/so_test/lib/test.dtd
    Tree built in 0 milliseconds
    Tree size: 5 nodes, 8 characters, 0 attributes
    Success!Execution time: 19.482079ms
    Memory used: 20648808
    

    Additional Information

    Saxon distributes the Apache version of Xerces, so use the resolver.jar found in the Apache Xerces distribution.
