What is the difference between HTML and XHTML?

Note: this is supposed to be the canonical post for this question. A number of answers exist already, but descriptions of the various differences are scattered all over the

1 answer

余生分开走

2021-01-25 11:25

What is the difference between HTML and XHTML?

There are many differences. The main one is that XHTML is HTML in an XML document, and XML has different syntax rules:

XML has a different namespace by default, so you'll have to use the HTML namespace, xmlns="http://www.w3.org/1999/xhtml" explicitly in an XHTML document

XML is case sensitive and you'll have to use lowercase for tag names and attributes and even the x in hexadecimal character references

XML doesn't have optional start and end tags, so you'll have to write out all of them in full

Likewise, XML doesn't have void tags, so you'll have to close every void element yourself with a slash.

Non-void elements that have no content can be written as a single empty element tag in XML.

XML can contain CDATA sections, sections of plain text delimited with <![CDATA[ .. ]]>; HTML cannot

On the other hand, there are no CDATA or PCDATA elements or attributes in XML, so you'll have to escape your < signs everywhere (except in CDATA sections)

Quotes around attribute values are not optional in XML, and there is no attribute minimization (name-only attributes)

And the XML parser is not as forgiving of errors as the HTML parser.
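The syntax rules above can be checked with any XML parser. Here is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper is_well_formed and the sample strings are made up for illustration, and a generic XML parser only approximates what a browser does with an XHTML document.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def is_well_formed(markup: str) -> bool:
    """Return True if the XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# (markup, which rule from the list it illustrates)
samples = [
    ("<p>one<br>two</p>", "void element not closed"),
    ("<p>one<br/>two</p>", "void element closed with a slash"),
    ("<ul><li>a<li>b</ul>", "end tags omitted"),
    ("<P>text</p>", "start and end tag differ in case"),
    ("<a href=page.html>x</a>", "attribute value not quoted"),
    ("<input disabled/>", "minimized (name-only) attribute"),
    ('<input disabled="disabled"/>', "attribute written out in full"),
    ("<div/>", "empty non-void element as a single tag"),
    ("<p><![CDATA[a < b]]></p>", "raw < sign inside a CDATA section"),
    ("<p>a < b</p>", "raw < sign outside a CDATA section"),
]

for markup, rule in samples:
    verdict = "ok   " if is_well_formed(markup) else "error"
    print(f"{verdict}  {markup:32}  {rule}")

# The HTML namespace has to be declared explicitly; ElementTree then
# reports it as part of the element name.
root = ET.fromstring(f'<html xmlns="{XHTML_NS}"><body/></html>')
print(root.tag)
```

Every sample marked "error" would be accepted by an HTML parser, which is what "not as forgiving" means in practice.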

Then there are a few differences that are not XML-related:

XHTML documents are always rendered in standards mode, never in quirks mode

XHTML does not look at meta commands in the head to determine the encoding. In fact, the W3C validator flags <meta http-equiv="content-type" ... as an error in XHTML5 files, but not in HTML5 files.

Earlier on, mismatches between the DTDs for XHTML 1.0 Strict and HTML 4.01 Strict led to validation issues. The definition for XHTML 1.0 was missing the name attribute on <img> and <form>. This was an error though, fixed in XHTML 1.1.

Note that XHTML documents should be served up with the correct file type, i.e. a .xhtml file extension or an application/xhtml+xml MIME type. You can't really have XHTML in an HTML document, because browsers don't differentiate between the two syntaxes by looking at the content, only by file type.
In other words, if you have an HTML file, its contents are HTML, no matter if it has valid XML in it or not.
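As a rough model of that decision, the sketch below picks a parser from the Content-Type header alone and never looks at the document body. The function name choose_parser and the set XML_TYPES are hypothetical, and the list of XML types is deliberately simplified; real browsers apply more detailed rules.

```python
# Simplified, assumed list of MIME types that select the XML parser.
XML_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}


def choose_parser(content_type: str) -> str:
    """Pick a parser from the Content-Type header value only."""
    # Drop parameters such as "; charset=utf-8" and normalize the case.
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime in XML_TYPES:
        return "xml"
    if mime == "text/html":
        return "html"
    return "other"


print(choose_parser("application/xhtml+xml; charset=utf-8"))
print(choose_parser("text/html"))
```

Note that the content of the document is not an input at all: perfectly valid XHTML markup served as text/html still goes to the HTML parser.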

One point about the syntax rules worth mentioning is the casing of tag names. Although HTML documents are case-insensitive, the tag names are actually exposed as uppercase by the DOM. That means that under HTML, a JavaScript command like console.log(document.body.tagName); would output "BODY", whereas the same command under XHTML would output "body".

Isn't XHTML merely a stricter version of HTML?

No; XML has different rules than HTML, but it's not necessarily stricter. If anything, XML has fewer rules!

In HTML, many features are optional. You can choose to put quotes around attribute values or not; in XML you don't have that choice. And in HTML, you have to remember when you have the choice and when you don't: are quotes optional in <a href=http://my-website.com/?login=true>? In XML, you don't have to think about that. XML is easier.

In HTML, some elements are defined as raw text elements, that is, elements that contain plain text rather than markup.
And some other elements are escapable raw text elements, in which references like &eacute; will be parsed, but things like <b>bold</b> and <!-- comments --> will be treated as plain text. If you can remember which elements those are, you don't have to escape < signs (you optionally can though). XML doesn't have that, so there's nothing to remember and all elements have the same content type.
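To see the effect of a raw text element, the following sketch feeds the same markup to Python's html.parser (which treats script and style as raw text) and to an XML parser. The class TagCollector and both helper functions are illustrative names, and html.parser is only a stand-in for a browser's HTML parser.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

MARKUP = "<script><b>bold</b></script>"


class TagCollector(HTMLParser):
    """Record every start tag and every piece of text the parser reports."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        self.text.append(data)


def html_view(markup):
    """Return (start tags, concatenated text) as seen by the HTML parser."""
    parser = TagCollector()
    parser.feed(markup)
    parser.close()
    return parser.tags, "".join(parser.text)


def xml_child_tags(markup):
    """Return the names of the root element's children as seen by XML."""
    return [child.tag for child in ET.fromstring(markup)]


print(html_view(MARKUP))       # the <b> is plain text inside the script
print(xml_child_tags(MARKUP))  # the <b> is a real child element
```

Under XML rules the script element has the same content model as every other element, so the < signs inside it would need escaping or a CDATA section.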

XML has processing instructions, the most well known of which is the XML declaration in the prolog, <?xml version="1.0" encoding="windows-1252"?>. This tells the browser which version of XML is used (1.0 is the only version that works, by the way) and which character encoding.
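A small sketch of how the declaration drives decoding, again using Python's xml.etree.ElementTree as a stand-in for a browser's XML parser. The helper name text_or_none and the sample bytes are made up for illustration.

```python
import xml.etree.ElementTree as ET


def text_or_none(document: bytes):
    """Return the root element's text, or None if the parser rejects the input."""
    try:
        return ET.fromstring(document).text
    except ET.ParseError:
        return None


# The byte 0xE9 is "é" in windows-1252, but it is not valid UTF-8 here.
with_declaration = b'<?xml version="1.0" encoding="windows-1252"?><p>caf\xe9</p>'
without_declaration = b"<p>caf\xe9</p>"

print(text_or_none(with_declaration))     # decoded using the declared encoding
print(text_or_none(without_declaration))  # rejected: XML defaults to UTF-8
```

Without the declaration the parser assumes UTF-8, and instead of guessing it stops at the first invalid byte.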

And XML parses comments in a different way. For example, HTML comments can't start with <!--> (that is, with a > directly after the opening <!--). Don't try that in HTML. (It's not recommended in XHTML either, because of compatibility issues, but you can.)
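The XML side of this is easy to try out. In the sketch below (the helper name xml_accepts is an assumption, and only the XML parser is exercised), a comment that begins with a > character right after the opening delimiter is accepted, while a comment containing a double hyphen is rejected.

```python
import xml.etree.ElementTree as ET


def xml_accepts(markup: str) -> bool:
    """Return True if the XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# A ">" directly after the opening delimiter is legal in XML.
print(xml_accepts("<r><!--> odd, but legal in XML --></r>"))

# A double hyphen inside a comment is not.
print(xml_accepts("<r><!-- a -- b --></r>"))
```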

Why are there different versions of XHTML if they all act the same?

They don't! For instance, in XHTML 1.1 you can refer to character entities like &eacute; and &nbsp;, because those entities are defined in the DTD. The current version of XHTML (formerly known as XHTML5) does not have a DTD, so you will have to use numerical references, in this case &#233; and &#160; (or define those entities yourself in the DOCTYPE declaration; the X means eXtensible, after all).
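The three options can be compared with a generic XML parser that, like the current XHTML, has no DTD to consult. This is a sketch using Python's xml.etree.ElementTree; the helper parse_text and the sample documents are illustrative, and a browser may treat a recognized XHTML DOCTYPE differently.

```python
import xml.etree.ElementTree as ET


def parse_text(markup: str):
    """Return the root element's text, or None if the parser rejects the input."""
    try:
        return ET.fromstring(markup).text
    except ET.ParseError:
        return None


# Named entity with no definition available: rejected.
named = "<p>caf&eacute;</p>"

# Numeric character reference: always available.
numeric = "<p>caf&#233;</p>"

# Named entity defined by the document itself in the DOCTYPE declaration.
declared = '<!DOCTYPE p [<!ENTITY eacute "&#233;">]><p>caf&eacute;</p>'

for label, doc in [("named", named), ("numeric", numeric), ("declared", declared)]:
    print(label, repr(parse_text(doc)))
```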
