HTML specification for rendering new lines?

Submitted by 孤人 on 2020-08-02 06:34:50

Question


I'm trying to render some simple HTML documents (containing mostly div and br tags) to plain text, but I'm struggling with when to add new lines. I assumed it would be quite simple, with <div> and <br/> generating new lines, but it looks like there are various subtle rules. For example:

<div>one line</div>
<div>two lines</div>

<hr/>

<div>one line</div>
<div></div>
<div>still two lines because the empty div doesn't count</div>

<hr/>

<div>one line<br/></div>
<div></div>
<div>still two lines because the br tag is ignored</div>

<hr/>

<div>one line<br/></div>
<div><br/></div>
<div>three lines this time because the second br tag is not ignored</div>

<hr/>

<div><div>Wrapped tags generate only one new line<br/></div></div>
<div><br/></div>
<div>three lines this time because the second br tag is not ignored</div>

So I'm looking for a specification on how new lines should be rendered in HTML documents (when no CSS is applied). Any idea where I could find this kind of document?


Answer 1:


If you are looking for the specification for <div> and <br>, you won't find it in one place, because each of them follows separate rules. DIV elements follow the block formatting rules, while BR elements follow the text flow rules.

I believe that the cause of your confusion is the assumption that they follow the same new-line rule. Let me explain.

The BR element.

BR is defined in HTML4 Specification Section 9.3 regarding Lines and Paragraphs:

The BR element forcibly breaks (ends) the current line of text.

And in HTML5 Specification Section 4.5 regarding Text-level semantics:

The <br> element represents a line break.

The specification explains the result of your third example:

<div>one line<br/></div>
<div></div>
<div>still two lines because the br tag is ignored</div>

There, the BR element is not ignored at all, because it marks that the line must be broken at that point. In other words, it marks the end of the current line of text. It is not about creating new lines.

In your fourth example:

<div>one line<br/></div>
<div><br/></div>
<div>three lines this time because the second br tag is not ignored</div>

the second BR element also marks the end of a line. Because that line has zero characters, it is rendered as an empty line.

Therefore, the rule is the same in your third and fourth example. Nothing is ignored.
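
To make the "a BR ends the current line" rule concrete, here is a minimal sketch in JavaScript. It only models the inline content of a single block as an array of strings and BR markers; the BR symbol and the splitIntoLines function are hypothetical names made up for illustration, not part of any specification or library.

```javascript
// Hypothetical marker standing in for a <br> element.
const BR = Symbol("br");

// Split the inline content of ONE block into rendered lines.
// A <br> ends the current line; it does not open a new line by itself.
function splitIntoLines(items) {
  const lines = [];
  let current = "";
  let open = false; // true once the current line has received text

  for (const item of items) {
    if (item === BR) {
      lines.push(current); // may be "", which renders as an empty line
      current = "";
      open = false;
    } else {
      current += item;
      open = true;
    }
  }
  if (open) lines.push(current); // trailing text that no <br> has ended
  return lines;
}

console.log(splitIntoLines(["one line", BR]));     // one line: "one line"
console.log(splitIntoLines([BR]));                 // one line: ""
console.log(splitIntoLines(["one line", BR, BR])); // two lines: "one line", ""
```

Under this model, a trailing BR after text adds nothing visible, while a BR on an otherwise empty line produces an empty line, which matches the third and fourth examples.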

The DIV element.

In the absence of an explicit style sheet, the default style applies. A DIV element is by default a block-level element, which means it follows the block formatting context rules defined in CSS Specification Section 9.4.1:

In a block formatting context, boxes are laid out one after the other, vertically, beginning at the top of a containing block.

Therefore, this is also not about creating new lines because in a block formatting context, there is no notion of lines. It is about placing block elements one after another from top to bottom.

In your second example:

<div>one line</div>
<div></div>
<div>still two lines because the empty div doesn't count</div>

the empty DIV has zero height, therefore it has no effect on the rendering of the next block-level element.

In your fifth example:

<div><div>Wrapped tags generate only one new line<br/></div></div>
<div><br/></div>
<div>three lines this time because the second br tag is not ignored</div>

the outer DIV functions as a containing block as defined in Section 9.1.2, and the inner DIV is laid out as described in Section 9.4.1, quoted above. Because no CSS is applied, a DIV element has zero margin and zero padding by default, so every edge of the inner DIV touches the corresponding edge of the outer DIV. In other words, the inner DIV is rendered at exactly the same place as the outer DIV.

I believe that's everything.
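
For the plain-text conversion the question asks about, the two rules can be combined in a small sketch. This is only an illustration under stated assumptions: the document is already parsed into a hypothetical tree model (strings for text, { tag: "br" }, and { tag: "div", children }), whitespace-only text between elements has been stripped, and elements with default margins such as <p> or <hr> are not modeled. The names br, div and renderToLines are made up for this example.

```javascript
// Hypothetical document model, not a real DOM.
const br = () => ({ tag: "br" });
const div = (...children) => ({ tag: "div", children });

function renderToLines(nodes) {
  const lines = [];
  let current = "";
  let open = false; // does the current line hold any text?

  // Close the pending line, if any. Called at every block boundary.
  const flush = () => {
    if (open) lines.push(current);
    current = "";
    open = false;
  };

  const walk = (node) => {
    if (typeof node === "string") {
      current += node;
      open = true;
    } else if (node.tag === "br") {
      lines.push(current); // a <br> always ends the current line, even an empty one
      current = "";
      open = false;
    } else {
      flush();                     // a block starts below any pending inline content
      node.children.forEach(walk);
      flush();                     // an empty block contributes nothing (zero height)
    }
  };

  nodes.forEach(walk);
  flush();
  return lines;
}

// Fourth example from the question: renders as three lines.
const fourth = [div("one line", br()), div(br()), div("three lines")];
console.log(renderToLines(fourth).join("\n"));
// one line
// (empty line)
// three lines
```

With this sketch, the question's five examples produce 2, 2, 2, 3 and 3 lines respectively, and nesting a DIV directly inside another DIV adds no extra line because flushing an already empty line does nothing.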




Answer 2:


<div>one line</div>
<div></div>
<div>still two lines because the empty div doesn't count</div>

I wouldn't say that the second div doesn't count. To be more precise, it has the default block width of 100% but a height of 0px because it is empty. There's no padding or margin either, but it's still technically there. It counts.

<div>one line<br/></div>
<div></div>
<div>still two lines because the br tag is ignored</div>

The br tag isn't ignored either. It has done its job of creating a line break within the current line of text within the parent block-level div. Emphasized wording is directly from the docs. Note that it mentions the current line of text only. It doesn't create the next line; it creates a break that may lead to a new line if there is content.

There simply isn't any text after it to be placed on the second line. Thus, the next div is created right below and abides by the rules mentioned above.

<div>one line<br/></div>
<div><br/></div>
<div>three lines this time because the second br tag is not ignored</div>

Building on the previous logic, none of the br tags are ever ignored. Both of the tags in this example are actually creating a new line break within their parent block level div elements.

These br tags act like a marker that states "from this point to the end of the line, within my parent block-level element, no inline content is allowed". However, in all of these cases there's nothing to be placed on the next line.

The next div, being a block-level element, basically resets that behavior. The previous breaks are contained within their lines of text and their parent block-level elements. We know this because a line of text cannot stretch between two block-level elements.

In regard to your comment on another answer.

Block-level elements do always start on a new line. As explained above, an empty div does exist and does start on a new line; it simply has 0 height. If you have two nested, empty div elements, they both start on the same new line because they are both empty block-level elements without any content that creates lines. If you add text to a parent div before the child div, the child div gets pushed to a new line. Think of it as the same line of text if it helps. For example:

Same line:

<div>
    <div>
        bar
    </div>
</div>

Different lines:

<div>
    foo
    <div>
        bar
    </div>
</div>



Answer 3:


  • <DIV> = division. It's a block of potentially mixed content.
  • <BR> = break. Just a line break.
  • <P> = paragraph.

If you want to create a document like a word processor then <P> is the way to go.

Lots of new developers seem to struggle with this when implementing tinyMCE the first few times. Hitting [enter] creates a <P>, while [shift]+[enter] creates a <br>. Exactly like a word processor.




Answer 4:


What you are missing here, I guess, is that div is a block-level element and thus always starts a new line (without CSS). As for the empty div, I think that since there is nothing to display, it will not render any new line; it may also depend on your browser's implementation of the HTML standard.

You can find more information on block and inline HTML elements here




Answer 5:


A block level element will always start on a new line unless it is the immediate first child of another element.

In your example #2

<div>one line</div>
<div></div>
<div>still two lines because the empty div doesn't count</div>

There are three lines, but they appear as two because the second div has no visual content. You can define custom margins and borders to make it visible.

A br element will always break the content flow and the node afterwards will start on a new line, regardless of whether that node happens to be a block-level element or not.




Answer 6:


For your second example, you can put &nbsp; inside the <div> so that it's rendered as an empty line. For your fourth example, you can put both br tags in the first div.

However, I'm not aware of any specification on this.

<div>one line</div>
<div>&nbsp;</div>
<div>still two lines because the empty div doesn't count</div>

<hr/>

<div>one line<br/><br/></div>
<div>three lines this time because the second br tag is not ignored</div>



Answer 7:


In your question you say that a <br/> tag between two divs is ignored, but your snippet seems buggy. It actually won't be ignored. I have corrected the snippet. This is the right way of inserting a new line in between without using CSS:

<div>one line</div>
<div>two lines</div>

<hr/>

<div>one line</div>
<div></div>
<div>still two lines because the empty div doesn't count</div>

<hr/>

<div>one line</div>
<br/>
<div>Three lines because the br tag is not ignored</div>

<hr/>

<div>one line</div>
<div><br/></div>
<div>three lines this time because the second br tag is not ignored</div>

<hr/>

<div><div>Wrapped tags generate only one new line<br/></div></div>
<div><br/></div>
<div>three lines this time because the second br tag is not ignored</div>



Answer 8:


How about letting the jQuery engine render the HTML to text? Take a look at the snippet below. If you run it, you'll see an alert box that displays just the text:

<html>
<head></head>

<body>

  <div id="sample">

    <div>one line</div>
    <div>two lines</div>

    <hr/>

    <div>one line</div>
    <div></div>
    <div>still two lines because the empty div doesn't count</div>

    <hr/>

    <div>one line
      <br/>
    </div>
    <div></div>
    <div>still two lines because the br tag is ignored</div>

    <hr/>

    <div>one line
      <br/>
    </div>
    <div>
      <br/>
    </div>
    <div>three lines this time because the second br tag is not ignored</div>

    <hr/>

    <div>
      <div>Wrapped tags generate only one new line
        <br/>
      </div>
    </div>
    <div>
      <br/>
    </div>
    <div>three lines this time because the second br tag is not ignored</div>

  </div>

  <script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
  <script>
    // Read the combined text content of #sample and show it in an alert box.
    var sample = $("#sample").text();
    alert(sample);
  </script>
</body>

</html>

You can use the content of the variable sample to process it further, for example submit it to an AJAX method.

If you run it, you will see that all of the tags are taken into account - it is just a matter of how the style defaults are defined. Having said that, I believe you can't disregard the styles completely, because they do matter - even if you don't specify any, some style will be assumed and applied.

What you get from $("#sample").text(); is just the line breaks and plain text, which is what I understood from your question you wanted to achieve.
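
One caveat is worth sketching. As far as I know, jQuery's .text() returns the combined text content of the matched elements and their descendants, so the line breaks in its output come from the whitespace in the HTML source, not from the br or div elements. The following is a minimal illustration on a hypothetical tree model (strings for text nodes, { tag, children } for elements); el and textContentOf are made-up names, not jQuery or DOM APIs.

```javascript
// Hypothetical tree model, not a real DOM.
const el = (tag, ...children) => ({ tag, children });

// Concatenate every text node in document order, ignoring the elements themselves.
function textContentOf(node) {
  if (typeof node === "string") return node;
  return node.children.map(textContentOf).join("");
}

// Same structure as the fourth example, written without any source whitespace.
const sample = el("div",
  el("div", "one line", el("br")),
  el("div", el("br")),
  el("div", "three lines")
);

console.log(JSON.stringify(textContentOf(sample))); // "one linethree lines"
```

If the markup is written on a single line, this kind of extraction yields no line breaks at all. In a browser, the innerText property is the rendering-aware alternative, although its exact output may still differ from what is needed here.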




Answer 9:


According to the spec, only the <br> and <wbr> elements are meant for line breaks:

  • <br> elements must be used only for line breaks that are actually part of the content, as in poems or addresses.
  • <br> elements must not be used for separating thematic groups in a paragraph (just use another <p> element).

You can also use <wbr> (more info here)

You can find more info at the spec itself. (Single page version to better search) https://www.w3.org/TR/html/single-page.html#elementdef-br

PS: Certain attributes accept LF (U+000A), like the title attribute in the <abbr> tag.

In the end any empty block element would do the job. (without CSS) The full list is here



Source: https://stackoverflow.com/questions/41014077/html-specification-for-rendering-new-lines