What characters are allowed in an HTML attribute name?

Submitted by 喜你入骨 on 2019-11-26 17:33:45
Alohci

It depends what you mean by "allowed". Each tag has a fixed list of valid attribute names, and in HTML they are case-insensitive. In one important sense, only those names, spelled with exactly those characters in the right order, are "allowed".

Another way of looking at it is to ask which characters browsers will treat as part of an attribute name. The best advice here comes from the parser section of the HTML5 spec, which can be found here: https://html.spec.whatwg.org/multipage/syntax.html#attributes-2

It says that every character except tab, line feed, form feed, space, solidus, greater-than sign and equals sign will be treated as part of the attribute name. In the current text, a quotation mark, apostrophe or less-than sign is flagged as a parse error but is still appended to the name. Personally, I wouldn't try pushing the edge cases of this, though.
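To make the parser's behaviour concrete, here is a rough Python sketch of how a tokenizer collects an attribute name. It is a simplification based on my reading of the "attribute name state" in the current WHATWG spec, not a real HTML parser: the function name scan_attribute_name and the NAME_TERMINATORS set are made up for illustration, and it ignores details such as a leading "=" and end-of-file handling. My reading is that quotation marks, apostrophes and less-than signs are parse errors but still end up in the name, so the sketch keeps them.

```python
from typing import Tuple

# Characters that end an attribute name in this simplified model
# (assumption: taken from the tokenizer's "attribute name state").
NAME_TERMINATORS = {"\t", "\n", "\f", " ", "/", ">", "="}


def scan_attribute_name(source: str, start: int = 0) -> Tuple[str, int]:
    """Return (name, index just past the name) for the name beginning at `start`."""
    chars = []
    i = start
    while i < len(source) and source[i] not in NAME_TERMINATORS:
        ch = source[i]
        if ch == "\0":
            # NULL is replaced with U+FFFD REPLACEMENT CHARACTER.
            chars.append("\ufffd")
        elif "A" <= ch <= "Z":
            # ASCII uppercase letters are lowercased.
            chars.append(ch.lower())
        else:
            # Everything else, including quotes and "<", is kept as-is.
            chars.append(ch)
        i += 1
    return "".join(chars), i


print(scan_attribute_name('@Click.Prevent="go"'))  # ('@click.prevent', 14)
print(scan_attribute_name("a'b<c d"))              # ("a'b<c", 5)
```

Note that this only shows what a lenient parser will swallow; it says nothing about whether the resulting name is valid.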

Assuming you're talking about XHTML, the XML rules apply.

See http://www.w3.org/TR/2008/REC-xml-20081126/#NT-Name

Names and Tokens

[4]     NameStartChar      ::=      ":" | [A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
[4a]    NameChar       ::=      NameStartChar | "-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]
[5]     Name       ::=      NameStartChar (NameChar)*
[6]     Names      ::=      Name (#x20 Name)*
[7]     Nmtoken    ::=      (NameChar)+
[8]     Nmtokens       ::=      Nmtoken (#x20 Nmtoken)*
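If you want to test a candidate name against productions [4], [4a] and [5], the grammar translates fairly directly into a regular expression. The following is a minimal Python sketch; the names NAME_START_CHAR, NAME_CHAR and is_xml_name are my own, and it checks only the Name production, nothing XHTML-specific.

```python
import re

# [4] NameStartChar, written as the inside of a regex character class.
NAME_START_CHAR = (
    ":A-Z_a-z"
    "\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    "\u0370-\u037D\u037F-\u1FFF\u200C-\u200D"
    "\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF"
    "\uF900-\uFDCF\uFDF0-\uFFFD"
    "\U00010000-\U000EFFFF"
)

# [4a] NameChar adds "-", ".", digits and a few more code points.
NAME_CHAR = NAME_START_CHAR + r"\-.0-9" + "\u00B7\u0300-\u036F\u203F-\u2040"

# [5] Name ::= NameStartChar (NameChar)*
XML_NAME = re.compile(f"[{NAME_START_CHAR}][{NAME_CHAR}]*")


def is_xml_name(candidate: str) -> bool:
    """True if the whole string matches the XML Name production."""
    return XML_NAME.fullmatch(candidate) is not None


print(is_xml_name("data-foo"))  # True
print(is_xml_name("-foo"))      # False: "-" is not a NameStartChar
print(is_xml_name("@click"))    # False: "@" is not allowed anywhere
```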

Maybe I'm missing something, but I believe the question is based on a false assumption. In HTML, attributes are strictly defined according to a fixed specification. If you 'make up' your own attribute names, you are no longer writing valid HTML.

Quentin

The values allowed are listed at w3.org. If you add a custom attribute, then you aren't writing HTML any more.

Since this question was asked, the web has evolved quite a bit. It's likely that authors of Web Components (custom elements) are landing here trying to learn what valid names can be used when defining attributes on custom elements.

There are several answers here that are partially correct, so I'm going to try to aggregate them and update them based on recent specs.

First, in HTML5, attribute names can start with most characters and are much more permissive than in previous versions of HTML. @S.Lott's answer is correct for HTML 2 and XHTML, but not for HTML5.

For HTML5: (spec)

Attribute names must consist of one or more characters other than the space characters, U+0000 NULL, U+0022 QUOTATION MARK ("), U+0027 APOSTROPHE ('), U+003E GREATER-THAN SIGN (>), U+002F SOLIDUS (/), and U+003D EQUALS SIGN (=) characters, the control characters, and any characters that are not defined by Unicode. In the HTML syntax, attribute names, even those for foreign elements, may be written with any mix of lower- and uppercase letters that are an ASCII case-insensitive match for the attribute's name.
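As an illustration, the rule quoted above can be approximated in a few lines of Python. This is a sketch under some assumptions: I take "space characters" to mean space, tab, line feed, form feed and carriage return, I treat the Unicode general category Cc as "control characters" and Cn as "not defined by Unicode", and the result for Cn depends on the Unicode version bundled with your Python. The names FORBIDDEN and is_html5_attribute_name are hypothetical.

```python
import unicodedata

# Characters the quoted rule excludes explicitly: the space characters
# (assumed to be space, tab, LF, FF, CR), NULL, and the five punctuation marks.
FORBIDDEN = set(" \t\n\f\r") | {"\0", '"', "'", ">", "/", "="}


def is_html5_attribute_name(name: str) -> bool:
    """Approximate the HTML5 authoring rule for attribute names."""
    if not name:
        # "one or more characters"
        return False
    for ch in name:
        if ch in FORBIDDEN:
            return False
        # Cc = control characters, Cn = unassigned code points
        # (my stand-in for "not defined by Unicode").
        if unicodedata.category(ch) in ("Cc", "Cn"):
            return False
    return True


print(is_html5_attribute_name("@click"))    # True
print(is_html5_attribute_name("[value]"))   # True
print(is_html5_attribute_name("on click"))  # False: contains a space
print(is_html5_attribute_name("a=b"))       # False: contains "="
```

Passing this check only means the name is syntactically acceptable; it does not mean the attribute is defined for the element you put it on.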

That being said, other commenters here are correct: when you use an attribute on a built-in element that's not in its list of valid attributes, you're technically violating the spec. Browser authors have a lot of tolerance for this, though, so in practice it doesn't do (much?) harm. A lot of libraries exploit this to enhance regular HTML tags, which causes some confusion, since it's technically not valid HTML. HTML5 provides a mechanism for custom data in attributes through the data- attribute naming convention.

These rules are different for custom elements.

Custom element authors are welcome to implement any sort of attribute they like on their element, but the attribute names are more restricted than in HTML5. In fact, the spec requires that the attribute name follow the XML Name restrictions:

The ASCII symbols and punctuation marks, along with a fairly large group of Unicode symbol characters, are excluded from names because they are more useful as delimiters in contexts where XML names are used outside XML documents; providing this group gives those contexts hard guarantees about what cannot be part of an XML name. The character #x037E, GREEK QUESTION MARK, is excluded because when normalized it becomes a semicolon, which could change the meaning of entity references.

Names and Tokens

[4] NameStartChar ::= ":" | [A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]

[4a] NameChar ::= NameStartChar | "-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]

[5] Name ::= NameStartChar (NameChar)*

[6] Names ::= Name (#x20 Name)*

[7] Nmtoken ::= (NameChar)+

[8] Nmtokens ::= Nmtoken (#x20 Nmtoken)*

So, for attribute names on custom elements you can use an upper- or lowercase ASCII letter, underscore "_", colon ":", or any of the Unicode characters called out in the spec as the start character, followed by any of those plus dashes "-", dots ".", digits, etc. as body characters. Note that a digit cannot be the first character.
