Recently I was editing someone else's code and I noticed that spaces were used something like the following:
Some text &nbsp; &nbsp;
As far as I know the rule is simple: two or more consecutive white-space characters are always rendered as a single space. If you want to render more, you need to use the &nbsp; entity.
So if you have this code (where [whitespace] stands for a regular space)
[whitespace][whitespace]
the browser renders only one space,
but when you have
[whitespace]&nbsp;[whitespace]
the browser renders 3 spaces, because the regular spaces are separated by the extra &nbsp;.
This rule doesn't apply to the entity itself, so if you have
&nbsp;&nbsp;
2 spaces will be rendered.
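If it helps, the rule can be modelled in a few lines of Python. This is only a rough sketch of the collapsing behaviour, not of what a real browser does internally: `rendered_spaces` is a made-up helper name, and it assumes the default `white-space: normal` and ignores line breaks and leading/trailing whitespace.

```python
import html
import re

NBSP = "\u00a0"  # the character that &nbsp; decodes to


def rendered_spaces(markup: str) -> int:
    """Count the spaces left after collapsing runs of regular spaces.

    Hypothetical helper: it models only the rule that consecutive
    U+0020 characters collapse into one.
    """
    text = html.unescape(markup)               # &nbsp; becomes U+00A0
    collapsed = re.sub(r" {2,}", " ", text)    # only regular spaces collapse
    return collapsed.count(" ") + collapsed.count(NBSP)


print(rendered_spaces("a  b"))            # two regular spaces -> 1
print(rendered_spaces("a &nbsp; b"))      # space, &nbsp;, space -> 3
print(rendered_spaces("a&nbsp;&nbsp;b"))  # two entities -> 2
```

The regular expression deliberately matches only U+0020, so the non-breaking spaces always survive.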
Browsers only collapse consecutive regular space characters. Text rendering is mostly governed by the CSS spec rather than the HTML spec (with exceptions); with respect to whitespace collapsing, section 16.6.1 of CSS2.1 has the details. Specifically:
any space (U+0020) following another space (U+0020) — even a space before the inline, if that space also has 'white-space' set to 'normal', 'nowrap' or 'pre-line' — is removed.
Since there's a non-breaking space separating every two space characters that would otherwise be consecutive (and non-breaking spaces are not considered "regular" space characters), the browser has no opportunity to collapse any of them, and so must render all of them in sequence.
The behavior across browsers is mostly identical, except for a nasty bug in Chrome regarding the part about "a space before the inline".
From the XHTML 1.0 W3C Recommendation:
4.7. White Space handling in attribute values
When user agents process attributes, they do so according to Section 3.3.3 of [XML]:
Strip leading and trailing white space.
Map sequences of one or more white space characters (including line breaks) to a single inter-word space.
For whitespace in between tags, see section 3.2, criterion 9:
3.2. User Agent Conformance
[1-8 snipped]
SPACE (&#x0020;)
HORIZONTAL TABULATION (&#x0009;)
CARRIAGE RETURN (&#x000D;)
LINE FEED (&#x000A;)
The XML processor normalizes different systems' line end codes into one single LINE FEED character, that is passed up to the application.
The user agent must use the definition from CSS for processing whitespace characters [CSS2]. Note that the CSS2 recommendation does not explicitly address the issue of whitespace handling in non-Latin character sets. This will be addressed in a future version of CSS, at which time this reference will be updated.
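As a rough illustration of the line-end normalization mentioned in that criterion, here is a small Python sketch. `normalize_line_ends` is a hypothetical name, and the sketch assumes the usual XML behaviour where a CR LF pair and a lone CR each become a single LF; it is not taken from any particular parser.

```python
def normalize_line_ends(text: str) -> str:
    """Map CR LF pairs and lone CRs to a single LF (hypothetical helper)."""
    # Replace the two-character sequence first, so that a CR LF pair
    # becomes one LF rather than two.
    return text.replace("\r\n", "\n").replace("\r", "\n")


print(repr(normalize_line_ends("a\r\nb\rc\nd")))  # 'a\nb\nc\nd'
```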
Also see section C.15:
C.15. White Space Characters in HTML vs. XML
Some characters that are legal in HTML documents, are illegal in XML document. For example, in HTML, the Formfeed character (U+000C) is treated as white space, in XHTML, due to XML's definition of characters, it is illegal.
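The attribute-value rule quoted from section 4.7 above (strip leading and trailing white space, then map each run to one space) could be sketched like this in Python. `normalize_attribute` is a made-up name, and the sketch assumes that "white space" means only the four XML characters listed above (space, tab, CR, LF), so a non-breaking space is left alone.

```python
import re

XML_WS = " \t\r\n"  # the four XML white space characters


def normalize_attribute(value: str) -> str:
    """Strip XML white space at both ends, then collapse inner runs."""
    stripped = value.strip(XML_WS)
    # Each run of one or more white space characters becomes one space.
    return re.sub(r"[ \t\r\n]+", " ", stripped)


print(repr(normalize_attribute("  foo \n\t bar  ")))  # 'foo bar'
print(repr(normalize_attribute("a\u00a0 b")))         # 'a\xa0 b'
```

The explicit character class is a deliberate choice: Python's `str.split()` without arguments also treats U+00A0 as whitespace, which would not match the rule described here.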