Question
I need to add an internal DTD subset (the ENTITY declarations enclosed in square brackets inside the DOCTYPE declaration) to an existing XML document.
For example, working from the specification for MathML in DAISY at http://www.daisy.org/projects/mathml/mathml-in-daisy-spec.html, say I am given this XML by an outside source:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE dtbook PUBLIC "-//NISO//DTD dtbook 2005-2//EN"
"http://www.daisy.org/z3986/2005/dtbook-2005-2.dtd">
<dtbook xmlns="http://www.daisy.org/z3986/2005/dtbook/" xmlns:m="http://www.w3.org/1998/Math/MathML"
version="2005-3" xml:lang="eng">
<m:math xmlns:dtbook="http://www.daisy.org/z3986/2005/dtbook/"
id="math0001" dtbook:smilref="nativemathml.smil#math0001"
altimg="nativemathml0001.png"
alttext="sigma-summation UnderScript i equals zero OverScript infinity EndScripts x Subscript i">
<m:mrow>
<m:mstyle displaystyle='true'>
<m:munderover>
<m:mo>∑</m:mo>
<m:mrow>
<m:mi>i</m:mi><m:mo>=</m:mo><m:mn>0</m:mn>
</m:mrow>
<m:mi>∞</m:mi>
</m:munderover>
<m:mrow>
<m:msub>
<m:mi>x</m:mi>
<m:mi>i</m:mi>
</m:msub>
</m:mrow>
</m:mstyle>
</m:mrow>
</m:math>
</dtbook>
I want to add the ENTITY definitions from the specification to make this book support MathML, so that the result looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE dtbook PUBLIC "-//NISO//DTD dtbook 2005-2//EN"
"http://www.daisy.org/z3986/2005/dtbook-2005-2.dtd"
[
<!ENTITY % MATHML.prefixed "INCLUDE" >
<!ENTITY % MATHML.prefix "m">
<!ENTITY % MATHML.Common.attrib
"xlink:href CDATA #IMPLIED
xlink:type CDATA #IMPLIED
class CDATA #IMPLIED
style CDATA #IMPLIED
id ID #IMPLIED
xref IDREF #IMPLIED
other CDATA #IMPLIED
xmlns:dtbook CDATA #FIXED 'http://www.daisy.org/z3986/2005/dtbook/'
dtbook:smilref CDATA #IMPLIED"
>
<!ENTITY % mathML2 PUBLIC "-//W3C//DTD MathML 2.0//EN"
"http://www.w3.org/Math/DTD/mathml2/mathml2.dtd"
>
%mathML2;
<!ENTITY % externalFlow "| m:math">
<!ENTITY % externalNamespaces "xmlns:m CDATA #FIXED
'http://www.w3.org/1998/Math/MathML'">
]
>
<dtbook xmlns="http://www.daisy.org/z3986/2005/dtbook/" xmlns:m="http://www.w3.org/1998/Math/MathML"
version="2005-3" xml:lang="eng">
<m:math xmlns:dtbook="http://www.daisy.org/z3986/2005/dtbook/"
id="math0001" dtbook:smilref="nativemathml.smil#math0001"
altimg="nativemathml0001.png"
alttext="sigma-summation UnderScript i equals zero OverScript infinity EndScripts x Subscript i">
<m:mrow>
<m:mstyle displaystyle='true'>
<m:munderover>
<m:mo>∑</m:mo>
<m:mrow>
<m:mi>i</m:mi><m:mo>=</m:mo><m:mn>0</m:mn>
</m:mrow>
<m:mi>∞</m:mi>
</m:munderover>
<m:mrow>
<m:msub>
<m:mi>x</m:mi>
<m:mi>i</m:mi>
</m:msub>
</m:mrow>
</m:mstyle>
</m:mrow>
</m:math>
</dtbook>
In Ruby, there is a method in Nokogiri that can be used to add these ENTITY definitions that looks like this: Nokogiri::XML::EntityDecl.new("MATHML.prefixed", doc, MATHML_ENTITY_DECL_TYPE, nil, nil, "INCLUDE")
Is there an equivalent to this in Java? We are using JDOM to manipulate our XML documents, but the JDOM DocType class doesn't appear to support these entity definitions.
Answer 1:
With JDOM you should be able to parse the original document and retrieve its DocType node from the document.
Your code would look something like:
Document doc = saxBuilder.build(myxmlfile);
DocType dtd = doc.getDocType();
That dtd object should be the reference to the dtbook DTD.
You can now take the string representation of the MathML declarations and set it as the internal subset of the DocType (perhaps you want to read it from a file, or as a system resource, or something similar).
String internal = " <!ENTITY % MATHML.prefixed \"INCLUDE\" >\n"
+ " <!ENTITY % MATHML.prefix \"m\">\n"
+ ......
dtd.setInternalSubset(internal);
See: http://www.jdom.org/docs/apidocs/org/jdom2/DocType.html#setInternalSubset(java.lang.String)
This will modify the declaration, and, if you output the XML, you should have the content you expect:
XMLOutputter xout = new XMLOutputter();
xout.output(doc, System.out);
Source: https://stackoverflow.com/questions/40293841/is-there-some-equivalent-in-java-to-rubys-nokogirixmlentitydecl