Numerical character reference entities… Nomenclature

Submitted by 耗尽温柔 on 2019-12-13 05:47:27

Question


It used to be so simple. Or so I thought.

  • nbsp is an entity
  • &nbsp; is, therefore, an entity reference (a reference to an entity)
  • &#160; is a character reference (a reference to a numerical character value)
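To make the distinction concrete, here is a minimal sketch using Python's standard `html` module (the choice of Python is mine, not part of the question). It only shows that the by-name form `&nbsp;` and the by-number form `&#160;` resolve to the same character; the variable names are illustrative.

```python
import html

# Reference by name and reference by decimal code point.
named = html.unescape("&nbsp;")
numeric = html.unescape("&#160;")

# Both forms resolve to the same single character, U+00A0 NO-BREAK SPACE.
assert named == numeric == "\u00a0"
print(f"U+{ord(named):04X}")
```

Whatever the two notations are called, the decoder treats them as two spellings of one character.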

But these days, I read so many documents, even official ones, where those words are all mangled together; you have character entities, named character references, numerical entities, reference entities, and so on.

So what is it really? What are these things really called? Who can I trust to have it right these days?

Edit: the resolution so far is that &nbsp; and &#160; have names ending in "reference" (although what's before the "reference" varies between HTML4, HTML5 and XML). If you call them something ending in "entity", you're most likely incorrect.


Answer 1:


You are correct except that nbsp is not an entity but an entity name. The entity is the thing that the entity reference refers to, in this case the no-break space character.
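As a rough illustration of the name-versus-entity split, the sketch below uses Python's `html.entities` tables, which cover the HTML 4 entity set. The variable names are my own, and treating the looked-up character as "the entity" is a simplification of the SGML model.

```python
import unicodedata
from html.entities import codepoint2name, name2codepoint

entity_name = "nbsp"                      # the entity name
code_point = name2codepoint[entity_name]  # the code point the name maps to
entity = chr(code_point)                  # the thing referred to: the character itself

print(entity_name, code_point, unicodedata.name(entity))

# The reverse table maps the code point back to its entity name.
assert codepoint2name[code_point] == entity_name
```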

The entity reference can also be called named entity reference (since SGML in general allows other types of entity reference, too). Similarly, the character reference can be called numeric character reference (to distinguish it from certain SGML concepts that never applied in HTML).

This is the SGML (ISO 8879) terminology that HTML specifications, up to and including HTML 4.01, nominally adhere to through their formal references to the SGML standard.

(Even HTML specifications use SGML terms sloppily, though. And in fact, HTML was never implemented as SGML-based, though some features of SGML are reflected in implementations.)

XHTML is based on XML, which is a simplification of SGML and formally defined as standalone. XML uses the terms entity reference and character reference, like SGML, but the longer names don’t apply.

HTML5 is something different: designed to be independent of SGML and XML. It also introduces its own terminology.




Answer 2:


I am basing this answer on the HTML5 specification, which I usually treat as trustworthy, although it is a working draft so subject to change.

nbsp is a "character reference name" (but the spec also calls it an "entity name")

&nbsp; is a "named character reference"

&#160; is a "decimal numeric character reference"

There is another option too:

&#x2020; (†) is a "hexadecimal numeric character reference"
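The three HTML5 terms can be told apart purely by syntax. The following is a small sketch: `classify_reference` is a hypothetical helper, and its regular expressions are a simplification rather than the specification's full grammar (for instance, any alphanumeric name is accepted, whether or not it is a defined one). `&#x2020;` is used here as the hexadecimal reference for †.

```python
import html
import re

# Simplified patterns (an assumption, not the spec's grammar); hex is checked first.
_PATTERNS = [
    (re.compile(r"&#[xX][0-9A-Fa-f]+;"), "hexadecimal numeric character reference"),
    (re.compile(r"&#[0-9]+;"), "decimal numeric character reference"),
    (re.compile(r"&[A-Za-z][A-Za-z0-9]*;"), "named character reference"),
]

def classify_reference(ref: str) -> str:
    """Return the HTML5 term for a single character reference string."""
    for pattern, label in _PATTERNS:
        if pattern.fullmatch(ref):
            return label
    return "not a character reference"

for ref in ("&nbsp;", "&#160;", "&#x2020;"):
    # html.unescape shows which character each reference resolves to.
    print(ref, "->", classify_reference(ref), "->", repr(html.unescape(ref)))
```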



Source: https://stackoverflow.com/questions/9737766/numerical-character-reference-entities-nomenclature
