I\'m parsing an HTML document with a couple Perl modules: HTML::TreeBuilder and HTML::Element. For some reason whenever the content of a tag is just
,
The character is non-breaking space which is what
stands for:
In word processing and digital typesetting, a non-breaking space ("
") (also called no-break space, non-breakable space (NBSP), hard space, or fixed space) is a space character that prevents an automatic line break at its position. In some formats, including HTML, it also prevents consecutive whitespace characters from collapsing into a single space.
In HTML, the common non-breaking space, which is the same width as the ordinary space character, is encoded as or . In Unicode, it is encoded as
U+00A0
.