When I come across a broken RSS feed, the usual reason it's all blown to pieces is that line 23 says "Sanford & Sons."
The most confusing thing is the fact
The & is a remnant of XML's roots in SGML. There the &...; syntax is used to escape all kinds of things, even whole documents to embed. So if you want a literal "&", you have to escape it as "&amp;". It is the same as escaping quotes inside strings in any programming language.
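To make the escaping concrete, here is a minimal sketch in Python (my choice of language, nothing in the question depends on it) using the standard library's xml.sax.saxutils.escape. The variable names and the <title> element are only for illustration.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Text as a human would write it, with a literal ampersand.
title = "Sanford & Sons"

# escape() replaces &, < and > with their entity references.
escaped = escape(title)

# Embedding the escaped text gives well-formed XML.
xml_doc = f"<title>{escaped}</title>"

# A conforming parser turns the entity back into a literal "&".
parsed_text = ET.fromstring(xml_doc).text

print(escaped)      # Sanford &amp; Sons
print(parsed_text)  # Sanford & Sons
```

The point is that the escaping only lives in the serialized feed; whoever reads it through an XML parser gets the plain "&" back.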
There is no point in having XML do error correction along the lines of "if no letter follows, output a literal &", because that would break the SGML syntax that XML is, as said, based on.
Most browsers do exactly that for HTML, because their makers decided it's better for users to see something than an SGML parse error. But this opens a whole new Pandora's box of which browser does what kind of error correction. Look at the HTML5 spec and you'll see what it means to really define error handling. It's lots of text.
One special case: You can include a literal "&" in XML/RSS, if you enclose it in a so-called "CDATA" section. That will look like the following:
<item> <![CDATA[ Smith & Wesson ]]> </item>
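If you want to check that a parser really accepts this, a quick sketch with Python's xml.etree.ElementTree could look like the following. The nested <title> element is my own addition so there is something to look up; it is not required for CDATA to work.

```python
import xml.etree.ElementTree as ET

# A raw "&" is allowed here because it sits inside a CDATA section.
doc = "<item><title><![CDATA[Smith & Wesson]]></title></item>"

root = ET.fromstring(doc)

# The parser hands back the CDATA content as ordinary text.
cdata_title = root.find("title").text

print(cdata_title)  # Smith & Wesson
```

Note that the CDATA markers themselves do not show up in the parsed text, only the content between them.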
Cheers,
Because RSS is XML, and XML demands certain characters be escaped, such as the ampersand.
This depends highly on the RSS client, but most likely it's attempting to XML-decode the contents (in your example "Sanford & Sons"). When that happens, & indicates the start of an escaped character. If you don't use &amp;, then as it decodes, it will try to use the next few characters to complete the escape sequence. Odds are high that it will fail.
When a 'raw' & is seen, the interpreter is looking for one of the valid escape sequences (such as '&amp;'). When an invalid sequence is found it throws an error. That's all there is to it.
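You can watch this happen with any strict XML parser. Below is a small sketch using Python's xml.etree.ElementTree; the helper name parses_ok is made up for this example, and other clients may of course report the error differently.

```python
import xml.etree.ElementTree as ET

def parses_ok(xml_text):
    """Return True if xml_text is well-formed, False if the parser rejects it."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Raw ampersand: the parser expects an entity name after "&" and gives up.
raw_ok = parses_ok("<title>Sanford & Sons</title>")

# Escaped ampersand: a valid entity reference, so the document parses.
escaped_ok = parses_ok("<title>Sanford &amp; Sons</title>")

print(raw_ok, escaped_ok)  # False True
```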
Because it must be escaped in XML syntax. Same reason here.
http://myst-technology.com/public/item/11878
Not sure if this helps, but when I needed to solve this problem I used the numeric entity ref for an ampersand, which is &#38;. Running this through the W3C validator passed, so I guess it's OK to use this.
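For what it's worth, a parser treats the numeric character reference for the ampersand (decimal &#38;, or &#x26; in hex) the same as the named one. A quick sketch in Python, with the <title> element chosen just for illustration:

```python
import xml.etree.ElementTree as ET

# Decimal and hexadecimal numeric character references for "&" (U+0026).
decimal_text = ET.fromstring("<title>Sanford &#38; Sons</title>").text
hex_text = ET.fromstring("<title>Sanford &#x26; Sons</title>").text

# The named entity, for comparison.
named_text = ET.fromstring("<title>Sanford &amp; Sons</title>").text

print(decimal_text)  # Sanford & Sons
```

All three forms decode to the same text, so which one to use is mostly a matter of taste.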
Cheers