I need to be able to reference named HTML entities like &bull; instead of the Unicode alternative &#8226; in an XML document. I have control over some parts of the XML document, such as defining the DOCTYPE, but doing a find-and-replace in the actual XML is not an option. I can get some entities like &nbsp; and &amp; by including the XHTML transitional DOCTYPE, but I need to define more manually. How do I do this?
-- EDIT --
Thanks to Jim's answer, here's what I ended up with. This is great because I can utilize the XHTML transitional entities, and also add my own:
<!DOCTYPE
    html
    PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
    [
        <!ENTITY bull "&#8226;">
        <!ENTITY ldquo "&#8220;">
        <!ENTITY rdquo "&#8221;">
        ... etc ...
    ]
>
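Writing every declaration by hand gets tedious, so here is a rough sketch of one way to generate them, assuming Python is available as a helper at build time. It uses the standard library's `html.entities.name2codepoint` table; the function names `entity_declarations` and `wrap` are made up for this illustration and are not part of any library. The five entities that XML predefines are skipped, since redeclaring `lt` and `amp` requires extra escaping.

```python
from html.entities import name2codepoint
import xml.etree.ElementTree as ET

# XML already predefines these five; leave them out of the generated list.
PREDEFINED = {"lt", "gt", "amp", "quot", "apos"}


def entity_declarations() -> str:
    """Return one <!ENTITY ...> declaration per HTML named entity."""
    lines = [
        f'<!ENTITY {name} "&#{codepoint};">'
        for name, codepoint in sorted(name2codepoint.items())
        if name not in PREDEFINED
    ]
    return "\n".join(lines)


def wrap(root_name: str, body: str) -> str:
    """Build a small document whose internal subset holds the declarations."""
    return (
        f"<!DOCTYPE {root_name} [\n{entity_declarations()}\n]>\n"
        f"<{root_name}>{body}</{root_name}>"
    )


if __name__ == "__main__":
    # Hypothetical sample text, only to check that the references resolve.
    doc = wrap("html", "&ldquo;quoted&rdquo; &bull; item")
    print(ET.fromstring(doc).text)
```

The output of `entity_declarations()` can be pasted between the `[` and `]` of the DOCTYPE. Declarations in the internal subset are read before the external DTD, so they should take precedence over any identically named entities the XHTML DTD defines.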
If you can modify the XML to include an inline DTD, you can define the entities there:
<!DOCTYPE yourRootElement [
<!ENTITY bull "&#8226;">
....
]>
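A quick way to check that an inline DTD like this actually resolves the entity is to run a small document through a parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the root element `note`, its text, and the helper `parse_text` are invented for the check and are not from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the root element name and text are made up.
DOC = """<!DOCTYPE note [
  <!ENTITY bull "&#8226;">
]>
<note>&bull; first item</note>"""


def parse_text(xml_source: str) -> str:
    """Parse the document and return the root element's text."""
    root = ET.fromstring(xml_source)
    return root.text


if __name__ == "__main__":
    print(parse_text(DOC))
```

If the declaration were missing, the parser would be expected to reject `&bull;` as an undefined entity instead of returning text, which is the failure the inline DTD is meant to prevent.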
I'm not certain, but I think the XHTML DTDs should give you quite a few entities (253).
Also, the W3C spec mentions additional DTDs for special characters etc.: http://www.w3.org/TR/xhtml-modularization/dtd_module_defs.html#a_dtd_xhtml_character_entities
However I haven't been able to find an implementation example of the special character DTDs.
Edit by DevNull
Here is an extremely generic example implementation of one of the entity DTD modules. To implement, you only need to add a parameter entity pointing to the module.
<?xml version="1.0"?>
<!DOCTYPE test [
<!ELEMENT test (#PCDATA)>
<!ENTITY % xhtml-special SYSTEM "xhtml-special.ent">
%xhtml-special;
]>
<test>Here is a left double quote: &ldquo;</test>
Source: https://stackoverflow.com/questions/6508860/how-do-i-define-html-entity-references-inside-a-valid-xml-document