Question
While learning web development I've run into a problem. Here is my code:
<!DOCTYPE html>
<html>
<head>
<style type="text/css">
p
{
background: blue;
}
.primary
{
color: red;
}
</style>
</head>
<body>
<p>
text1
<div class="primary">
text
</div>
text2
</p>
</body>
</html>
For some reason, browsers turn the opening <p> into a complete <p>text</p> element and do the same with the closing </p>. So instead of a <div> nested in a <p> (which is what I actually wrote in the source!), I get this:
...
<body>
<p>
text1
</p>
<div class="primary">
text
</div>
text2
<p></p>
</body>
...
I suppose this happens because <p> can be both an empty element (like <br>) and a non-empty element (like <div>). Can you please explain the problem and suggest a solution? Thank you.
Answer 1:
The <div> tag, like <p>, is a block-level element, which means it is designed to contain its own block of content with line breaks around it. Trying to nest a <div> inside a <p> is not likely to do what you want, as it doesn't make much sense. A <p> is a paragraph, and it should contain no block-level elements. This question may be related:
https://stackoverflow.com/questions/4291467/nesting-block-level-elements-inside-the-p-tag-right-or-wrong
Try using <span> instead, because <span> is an inline element, which is designed to be displayed inside a paragraph. If you really do need multiple block-level elements there, consider not using the <p> there at all, or using <p> elements as the innermost block-level elements rather than as an outer element.
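The two suggestions above could look roughly like the sketch below. It reuses the asker's styles; the class name wrapper is a hypothetical addition so that the blue background still applies once the outer <p> is replaced.

```html
<!DOCTYPE html>
<html>
<head>
<title>Nesting fixes (illustrative sketch)</title>
<style>
/* "wrapper" is a hypothetical class name, not part of the original code */
p, .wrapper { background: blue; }
.primary { color: red; }
</style>
</head>
<body>
<!-- Option 1: keep the <p> and mark up the inner text with an inline <span> -->
<p>
text1
<span class="primary">text</span>
text2
</p>

<!-- Option 2: keep the inner <div> and replace the outer <p> with a <div> -->
<div class="wrapper">
text1
<div class="primary">text</div>
text2
</div>
</body>
</html>
```

With option 1 the red text stays on the same line as its neighbours. If it has to sit on its own line, setting display: block on the <span> in CSS is one way to get that while keeping the markup valid.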
Answer 2:
In the beginning, there was the Standard Generalized Markup Language (SGML). SGML defined some aspects of the syntax like punctuation and tags, but each user application defined parts of the syntax such as tag names, attributes, nesting.
Decades later, SGML was simplified to create the XML standard. The way XML is used today for many application-specific data formats is similar to how SGML was used in the past. SGML and XML are essentially meta-languages - they are a syntax template for many application-specific languages.
HTML was initially designed as an application of SGML, so understanding the history of HTML requires knowing some rules of SGML. SGML was intended to be editable in a text editor, so it included many features that reduce the amount of markup, making it more convenient for humans to write and read. Just a few examples:
- Some elements like <br> are self-terminating, and thus never have a corresponding </br> end tag.
- Some elements like <tbody> are implicitly inserted, e.g. <table><tr><td></td></tr></table> becomes <table><tbody><tr><td></td></tr></tbody></table>.
- Some elements like <p> cannot nest in each other, so starting one will terminate the old one: <p><p> becomes <p></p><p></p>.
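As a rough illustration of the three shortcuts listed above, the fragment below shows markup with the optional tags left out, followed by approximately the explicit form a parser treats it as. The cell and paragraph text is made up for the example.

```html
<!-- As written, relying on omitted and implied tags -->
<table>
<tr><td>cell</td></tr>
</table>
<p>first line<br>second line
<p>second paragraph

<!-- Roughly the equivalent markup with everything spelled out -->
<table>
<tbody>
<tr><td>cell</td></tr>
</tbody>
</table>
<p>first line<br>second line
</p>
<p>second paragraph
</p>
```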
These element/tag-level syntax features are enabled/disabled through the SGML declaration and document type definition (DTD). HTML up to version 4.01 certainly had a DTD, and this was considered as the source of truth on how a parser should interpret markup code. The DTD can also tell us things like (not an exhaustive list):
- What attributes each element is allowed to have.
- Whether an attribute is optional, required, or has a default value.
- Distinctions between PCDATA and CDATA, which affects how characters are escaped.
- Exactly what elements are allowed to nest within what.
The DTD is where we can find our answer, at least historically speaking for HTML 4.01 Strict:
<!ELEMENT P - O (%inline;)* -- paragraph -->
<!ENTITY % inline "#PCDATA | %fontstyle; | %phrase; | %special; | %formctrl;">
<!ENTITY % fontstyle
"TT | I | B | BIG | SMALL">
<!ENTITY % phrase "EM | STRONG | DFN | CODE |
SAMP | KBD | VAR | CITE | ABBR | ACRONYM" >
<!ENTITY % special
"A | IMG | OBJECT | BR | SCRIPT | MAP | Q | SUB | SUP | SPAN | BDO">
<!ENTITY % formctrl "INPUT | SELECT | TEXTAREA | LABEL | BUTTON">
The code above says that a <p> element can only contain %inline content, which is further defined as any of #PCDATA, %fontstyle, %phrase, %special, %formctrl. The definitions of the latter 4 are a set of 31 elements like <tt>, <strong>, <img>, <textarea>, etc. Notice that these so-called inline elements do not include block elements like <div>, <ul>, and so on. In other words, <p> cannot contain <div>.
I don't know the details of how an SGML parser behaves in every situation, but it looks like when one element is not allowed to contain another, the first element is terminated and then the second element begins. (The "- O" in the <!ELEMENT P ...> declaration above marks the end tag of <p> as optional, which is what permits this implicit termination.) This explains why <p><div></div></p> becomes <p></p><div></div><p></p>.
Fast forward to HTML5, which is no longer based on SGML. Although HTML5 is a bespoke, one-of-a-kind syntax standard, it is intended to be backward-compatible with HTML 4. HTML5 replicates the semantics of correct HTML 4 code, and additionally mandates a uniform way to parse erroneous markup ("tag soup") so that all browsers behave the same. So the interpretation of <p><div></div></p> is unchanged from the SGML days.
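One way to see this behaviour for yourself is a small test page like the sketch below, opened in a browser with the developer console visible. The id outer is a hypothetical name added only so the script can find the first paragraph.

```html
<!DOCTYPE html>
<html>
<head>
<title>Parse check (illustrative sketch)</title>
</head>
<body>
<p id="outer">text1<div class="primary">text</div>text2</p>
<script>
  // "outer" is a hypothetical id used only for this check.
  const outer = document.getElementById("outer");

  // The <div> start tag closed the paragraph, so the <div> is not a descendant of it.
  console.log(outer.querySelector("div"));            // null

  // Instead, the <div> became the next sibling of the paragraph.
  console.log(outer.nextElementSibling.tagName);      // "DIV"

  // The stray </p> produced a second, empty <p> element.
  console.log(document.querySelectorAll("p").length); // 2
</script>
</body>
</html>
```

The three logged values match the DOM shown in the question: a <p> holding only text1, then the <div>, then text2 followed by an empty <p>.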
For <p> in particular, the rule is explained very clearly in the HTML specification:
A p element's end tag can be omitted if the p element is immediately followed by an address, article, aside, blockquote, details, div, ...
Also, <p> is only allowed to contain "phrasing content" (note the lack of <div>):
Phrasing content is the text of the document, as well as elements that mark up that text at the intra-paragraph level. Runs of phrasing content form paragraphs. a, abbr, area (if it is a descendant of a map element), audio, b, bdi, bdo, br, button, canvas, cite, code, data, datalist, del, dfn, em, embed, i, [...], autonomous custom elements, text
Source: https://stackoverflow.com/questions/44835895/div-nested-in-p