Question
I need to represent content in a lingua franca, which nowadays means the HTML5 standard. My objective is not to show a page in a web browser: I need to represent only content, with no interface, no layout, and no logic (no JavaScript).
As noted in other questions (here or on Programmers) and in the "HTML vs XHTML" section of the W3C HTML5 Recommendation,
the DOM, the HTML syntax, and the XHTML syntax cannot all represent the same content.
OK, but about 90% can be the same, and if I do not need JavaScript, styles, etc., and I can enforce some constraints, it can be 100%. So the question is: what constraints do I need to enforce so that any HTML5 document serialized as XHTML5 represents the same thing, and vice versa (for example, via an XSLT that converts back to the original HTML5 document)?
Is there a "subset of HTML5 elements", or a "subset with some additional constraints", that ensures the reversibility of XHTML5/HTML5 conversions?
Answer 1:
Polyglot Markup: A robust profile of the HTML5 vocabulary, which is currently a W3C Candidate Recommendation, defines rules for a document
[…] that is a stream of bytes that parses into identical document trees (with some exceptions, as noted in the Introduction) when processed either as HTML or when processed as XML.
You can find the rules for writing such a document in section 4: Writing HTML documents.
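To make the "parses into identical document trees" idea concrete, here is a minimal sketch in Python that feeds the same string to an XML parser and to an HTML tokenizer and compares the resulting element/text events. The names `parses_identically`, `TagCollector`, `xml_events` and the sample document are my own illustrations, not anything from the Polyglot Markup document. Note that the standard library's `html.parser` is only a tokenizer, not a full HTML5 tree builder, so this is a rough approximation of the real comparison; it also ignores the `xmlns` attribute, whitespace-only text, comments and the doctype.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TagCollector(HTMLParser):
    """Record start/end/text events as seen by the HTML tokenizer."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.events = []

    def handle_starttag(self, tag, attrs):
        # The XML parser consumes xmlns declarations, so drop them here too.
        kept = sorted((k, v) for k, v in attrs if k != "xmlns")
        self.events.append(("start", tag, kept))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        if data.strip():
            self.events.append(("text", data.strip()))


def xml_events(elem):
    """Flatten an ElementTree element into the same event format."""
    tag = elem.tag.split("}", 1)[-1]  # strip the "{namespace}" prefix
    events = [("start", tag, sorted(elem.attrib.items()))]
    if elem.text and elem.text.strip():
        events.append(("text", elem.text.strip()))
    for child in elem:
        events.extend(xml_events(child))
        if child.tail and child.tail.strip():
            events.append(("text", child.tail.strip()))
    events.append(("end", tag))
    return events


def parses_identically(markup: str) -> bool:
    """True if the markup is well-formed XML and both parsers agree."""
    try:
        root = ET.fromstring(markup)
    except ET.ParseError:
        return False  # not even valid as XHTML
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.events == xml_events(root)


# Hypothetical sample written in a polyglot style: lowercase names,
# quoted attributes, explicit end tags, void elements closed with "/>".
POLYGLOT = """<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
<head><meta charset="utf-8"/><title>Demo</title></head>
<body><p>Hello <em>world</em>!<br/>Bye.</p></body>
</html>"""

print(parses_identically(POLYGLOT))
print(parses_identically("<p>one<br>two</p>"))  # unclosed <br> is not XML
```

A real verification would replace `TagCollector` with a conforming HTML5 parser and compare full DOM trees, but the sketch shows the kind of constraints (explicit end tags, self-closed void elements, lowercase names) that make both parses line up.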
Source: https://stackoverflow.com/questions/28419046/there-are-a-subset-or-constraint-of-html5-that-can-be-perfectly-mapped-into-xhtm