Why indeed? Wouldn't something like &br; be more appropriate?
Entities are stand-ins for other characters or bits of text. In HTML they are used to represent characters that are hard to type (e.g. &mdash; for "—") or for characters that need to be escaped (&amp; for "&"). What would a hypothetical &br; entity stand for?
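To make the "stand-in" idea concrete, here is a small sketch using html.unescape from Python's standard library, which expands HTML character references. Python is only my choice for illustration, and the sample strings are made up. The point is that &mdash; and &amp; each map to a character, while &br; has nothing to map to and is left alone.

```python
import html

# Defined character references expand to the characters they stand for.
em_dash = html.unescape("&mdash;")    # U+2014 EM DASH
ampersand = html.unescape("&amp;")    # a literal ampersand

# "&br;" is not a defined reference, so it passes through unchanged.
hypothetical = html.unescape("one&br;two")

print(repr(em_dash), repr(ampersand), repr(hypothetical))
```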
It couldn't be \r or \n or \r\n, as these are already easy enough to type (just press Enter). The issue you're trying to work around is that HTML collapses whitespace in most contexts and treats newlines as spaces. That is, \n is not a line break character; it is just whitespace, like tabs and spaces.
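As a rough illustration of that collapsing, here is a sketch in Python. The helper name collapse_whitespace is hypothetical, and this only approximates default rendering: it ignores things like <pre> and the CSS white-space property, which change the behavior.

```python
import re

def collapse_whitespace(text: str) -> str:
    """Replace each run of spaces, tabs, and newlines with a single space."""
    return re.sub(r"[ \t\r\n\f]+", " ", text)

# The newlines typed in the source do not survive as line breaks.
source = "first line\nsecond line\r\n\tthird line"
collapsed = collapse_whitespace(source)
print(collapsed)
```

Everything comes out on one line, which is why typing a newline (or an entity that expands to one) would not give you a hard line break.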
An entity &br; would have to be replaced by some other text. What character do you use to represent the concept of "hard line break"? The standard line break character \n is exactly the right character, but unfortunately it's unsuitable since it's thrown in the generic "whitespace" bucket. You'd have to either overload some other control character to represent "hard line break", or use some extended Unicode character. When HTML was designed, Unicode was only a nascent, still-developing standard, so that wasn't an option.
A <br> element was the simple, straightforward way to add the concept of "hard line break" to a document since no character could represent that concept.
Yes. An HTML entity would be more appropriate, as a break tag cannot contain text and behaves much like a newline.
That's just not the way things are, though. Too late. I can't tell you the number of non-XML-compatible HTML documents I've had to deal with because of unclosed break tags...
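For anyone who hasn't run into the XML problem, a minimal sketch with Python's standard xml.etree.ElementTree shows it. The helper parses_as_xml and the sample markup are mine, just for illustration.

```python
import xml.etree.ElementTree as ET

def parses_as_xml(markup: str) -> bool:
    """Return True if the markup is well-formed XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

# An unclosed <br> is fine in HTML but not well-formed as XML.
print(parses_as_xml("<p>one<br>two</p>"))

# The self-closing form satisfies an XML parser.
print(parses_as_xml("<p>one<br/>two</p>"))
```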