Question
I process XML files using SAX:
XMLReader reader = XMLReaderFactory.createXMLReader("org.apache.xerces.parsers.SAXParser");
reader.setFeature("http://xml.org/sax/features/validation", Boolean.TRUE);
reader.setFeature("http://apache.org/xml/features/validation/schema", Boolean.TRUE);
I load a grammar (XSD) and set it on the reader:
reader.setProperty("http://apache.org/xml/properties/internal/grammar-pool", grammarPool);
The grammar declares default values for some optional attributes of some elements. Attributes that have a default value are passed with that value to my handler's startElement method (ContentHandler#startElement), even if they are not present in the source XML. Can I somehow check whether an attribute is actually present in the XML?
Answer 1:
A SAX2 parser that reports the feature flag http://xml.org/sax/features/use-attributes2 as true will pass instances of the Attributes2 interface as the Attributes argument of ContentHandler#startElement(String uri, String localName, String qName, Attributes atts). This extended interface provides, among others, the methods isSpecified(int index) and isSpecified(String qName), which return true if the attribute was specified in the document, or false if the value is defaulted through the DTD or schema.
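To make the isSpecified check concrete, here is a minimal sketch. It assumes the JDK's built-in SAX parser, which I expect to hand out Attributes2 instances, and it uses a small internal DTD instead of an XSD so that it stays self-contained. The class name SpecifiedAttributesDemo, the method describeAttributes and the sample document are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.Attributes2;
import org.xml.sax.helpers.DefaultHandler;

class SpecifiedAttributesDemo {

    /** Parses the XML and labels every attribute as specified or defaulted. */
    public static List<String> describeAttributes(String xml) {
        final List<String> result = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    // Instance test instead of querying the use-attributes2 feature flag
                    if (!(atts instanceof Attributes2)) {
                        result.add(qName + ": Attributes2 not available");
                        return;
                    }
                    Attributes2 atts2 = (Attributes2) atts;
                    for (int i = 0; i < atts2.getLength(); i++) {
                        String origin = atts2.isSpecified(i) ? "specified" : "defaulted";
                        result.add(qName + "/@" + atts2.getQName(i) + "="
                                + atts2.getValue(i) + " (" + origin + ")");
                    }
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return result;
    }

    public static void main(String[] args) {
        // Attribute "a" has a DTD default and is absent from the document;
        // attribute "b" is written explicitly.
        String xml = "<!DOCTYPE root [\n"
                + "<!ELEMENT root EMPTY>\n"
                + "<!ATTLIST root a CDATA \"x\" b CDATA #IMPLIED>\n"
                + "]>\n"
                + "<root b=\"y\"/>";
        describeAttributes(xml).forEach(System.out::println);
    }
}
```

With a parser that supports Attributes2, attribute b should be labelled as specified and attribute a as defaulted. The same check should apply to defaults coming from an XSD once schema validation is enabled on the reader.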
A use case I found was transducing from XHTML 4.01 Transitional to (X)HTML5 in an EPUB 3.0.1 pipeline. One snag I hit was that the Transitional DTD has the <br> element default the attribute clear to none; this attribute is invalid in HTML5. To avoid manually filtering every attribute that is no longer valid in HTML5, I rebuild an Attributes object that leaves out the defaulted attributes, as follows:
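import org.xml.sax.Attributes;
import org.xml.sax.ext.Attributes2;
import org.xml.sax.ext.Attributes2Impl;

public static Attributes filterDefaults(Attributes attributes) {
    // Either test the feature flag up front, or do an instance test as here.
    // Without Attributes2 defaulted attributes cannot be told apart, so return the input unchanged.
    if (!(attributes instanceof Attributes2)) {
        return attributes;
    }
    Attributes2 attrs = (Attributes2) attributes;
    Attributes2Impl newAttrs = new Attributes2Impl();
    for (int i = 0; i < attrs.getLength(); i++) {
        // Keep only the attributes that were written in the document itself
        if (attrs.isSpecified(i)) {
            // Copy the namespace URI and local name as well, so namespace-aware consumers still work
            newAttrs.addAttribute(attrs.getURI(i), attrs.getLocalName(i),
                    attrs.getQName(i), attrs.getType(i), attrs.getValue(i));
        }
    }
    return newAttrs;
}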
The XMLReader should be set up to validate both the DTD or schema and the input XML, more or less as follows:
/**
 * @see "https://xerces.apache.org/xerces2-j/features.html"
 */
private XMLReader buildParser(SAXParserFactory parserFactory) throws SAXException {
    try {
        final SAXParser parser = parserFactory.newSAXParser();
        final XMLReader reader = parser.getXMLReader();
        if (!reader.getFeature("http://xml.org/sax/features/use-attributes2"))
            throw new SAXException("SAX2 parser with Attributes2 required");
        // Set your callback instances here
        reader.setEntityResolver(/*...*/);
        reader.setErrorHandler(/*...*/);
        reader.setContentHandler(/*...*/);
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", /*...*/);
        reader.setFeature("http://xml.org/sax/features/namespaces", true);
        reader.setFeature("http://xml.org/sax/features/validation", true);
        reader.setFeature("http://apache.org/xml/features/validation/schema", true);
        return reader;
    } catch (ParserConfigurationException e) {
        throw new SAXException("Can't build parser", e);
    }
}
Answer 2:
That's how default values for attributes are supposed to work. If you need to distinguish the two cases (defaulted vs. explicitly specified but with the default value) then you'll have to remove the default from the schema and apply it at the code level instead.
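Applying the default at the code level could look roughly like the sketch below. It assumes the default has already been removed from the schema, so the parser reports only attributes that are really in the document. The class CodeLevelDefaults and the helpers valueOrDefault and isPresent are hypothetical names, not part of SAX.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;

class CodeLevelDefaults {

    /** Hypothetical helper: the attribute's value, or the fallback when it is absent. */
    public static String valueOrDefault(Attributes atts, String qName, String fallback) {
        String value = atts.getValue(qName); // null when the attribute is not in the list
        return value != null ? value : fallback;
    }

    /** Hypothetical helper: true when the attribute was reported by the parser. */
    public static boolean isPresent(Attributes atts, String qName) {
        return atts.getIndex(qName) >= 0; // getIndex returns -1 for unknown names
    }

    public static void main(String[] args) {
        // Stand-in for the Attributes object a parser would pass to startElement
        AttributesImpl atts = new AttributesImpl();
        atts.addAttribute("", "clear", "clear", "CDATA", "left");

        System.out.println(valueOrDefault(atts, "clear", "none"));   // prints "left"
        System.out.println(valueOrDefault(atts, "align", "center")); // prints "center"
        System.out.println(isPresent(atts, "align"));                // prints "false"
    }
}
```

The trade-off is that the default then lives in the handler rather than in the schema, so other consumers of the XSD no longer see it.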
Source: https://stackoverflow.com/questions/15139149/xml-sax-attributes-with-default-value