Integrate schema metadata during XML parsing with Xerces-C++

I want to parse an XML file and look up the datatype of attributes and entities in an XML schema file (.xsd) when I traverse the DOM.

I found out that I can use the Post-Schema-Validation Infoset (PSVI) to get that information. For this, I should be able to get a node's info with the getFeature method:

info = (xercesc::DOMPSVITypeInfo*) domNode->getFeature(xercesc::XMLUni::fgXercesDOMHasPSVIInfo, xercesc::XMLUni::fgVersion1_1);

However, it seems I first have to enable this feature. As there is no setFeature method on the parser object, I tried "useImplementation", but this just crashes my program.

As the Xerces documentation is pretty sparse with respect to PSVI, maybe someone here knows how to get schema information while parsing an XML document using a XercesDOMParser.

Thanks in advance!


You should be able to configure the DOMLSParser through its DOMConfiguration (see the getDomConfig() function) and avoid the cast to the implementation class. The DOMConfiguration has a couple of setParameter() functions, which should support Xerces' many configuration properties, including those for XML Schema validation.


I found a solution meanwhile:

//create parser
static const XMLCh gLS[] = { xercesc::chLatin_L, xercesc::chLatin_S, xercesc::chNull };
xercesc::DOMImplementation *impl = xercesc::DOMImplementationRegistry::getDOMImplementation(gLS);
DOMLSParserImpl* parser = dynamic_cast<DOMLSParserImpl*>(impl->createLSParser(DOMImplementationLS::MODE_SYNCHRONOUS, 0));

parser->setParameter(xercesc::XMLUni::fgXercesDOMHasPSVIInfo, true);  //collect schema info
parser->setParameter(xercesc::XMLUni::fgDOMComments, false); //discard comments



xercesc::DOMAttr& attr = (xercesc::DOMAttr&) attributeNode;
cout << " name: " << transcode(attr.getName()) << " type: " << transcode(attr.getSchemaTypeInfo()->getTypeName()) << ", ";

It's a bit messy to cast the parser down to the Impl class, but it's the only way I found to access the setParameter function. I think there must be a "correct" way to initialize the parser, though...

Also, it's important to set the validation scheme and set DoSchema to true; otherwise the parser won't attach the schema information to the DOM elements.

