I see the following HTML code used a lot to load jQuery from a content delivery network, but fall back to a local copy if the CDN is unavailable (e.g. in the Modernizr docs)
When the browser sees </script>, it considers this to be the end of the script block. The HTML parser has no idea about JavaScript, so it can't distinguish between something that just appears in a string and something that's actually meant to end the script element. So </script> appearing literally in JavaScript that's inside an HTML page will (in the best case) cause errors and (in the worst case) be a huge security hole.
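To make this concrete, here is a minimal sketch of a parser that delimits a script element the way described above: it looks for the next closing-tag sequence and ignores JavaScript syntax entirely. The function name extractFirstScriptBody is hypothetical, and real HTML parsers follow more detailed rules, so treat this as a simplified model rather than what any browser actually runs.

```javascript
// Simplified model (an assumption, not a real browser algorithm):
// find the opening tag, then cut the body at the next closing-tag
// sequence, without any knowledge of JavaScript strings or comments.
function extractFirstScriptBody(html) {
  const open = /<script\b[^>]*>/i.exec(html);
  if (open === null) return null; // no script element at all
  const bodyStart = open.index + open[0].length;
  const rest = html.slice(bodyStart);
  const close = rest.search(/<\/script/i);
  return close === -1 ? rest : rest.slice(0, close);
}

// The page is assembled from pieces so that this example could itself
// be pasted into an inline script block without being cut short.
const page = "<script>var s = '<" + "/script>'; alert(s);<" + "/script>";
const body = extractFirstScriptBody(page);
console.log(JSON.stringify(body)); // "var s = '"
```

The extracted body stops in the middle of the string literal, so the script engine would receive an unterminated string and the rest of the line would be treated as page content.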
That's why you somehow have to prevent this sequence of characters from appearing. Other common workarounds for this issue are "<"+"/script>" and "<\/script>" (they all come down to the same thing).
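As a quick check that these spellings really do come down to the same thing, the following sketch compares them at runtime. The variable names are only illustrative.

```javascript
// Each literal avoids the closing-tag sequence in the source text,
// yet all of them evaluate to the same nine-character string.
const concatenated = "<" + "/script>";
const backslashed = "<\/script>";  // "\/" is simply "/" inside a string
const hexEscaped = "\x3C/script>"; // "\x3C" is the character "<"

const allEqual = concatenated === backslashed && backslashed === hexEscaped;
console.log(allEqual, concatenated.length); // true 9
```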
While some consider this to be a "bug", it actually has to happen this way, since, as per the specification, the HTML part of the user agent is completely separate from the scripting engine. You can put all kinds of things into <script> tags, not just JavaScript. The W3C mentions VBScript and Tcl as examples. Another example is the jQuery template plugin, which uses those tags as well.
But even within JavaScript, where you could suggest that such content in strings could be recognized and thus not be treated as ending tags, the next ambiguity comes up when you consider comments:
<script type="text/javascript">foo(42); // call the function </script>
What should the browser do in this case?
And finally, what about browsers that don't even know JavaScript? They would just ignore the part between <script> and </script>, but if you gave different semantics to the character sequence </script> based on the browser's knowledge of JavaScript, you'd suddenly have two different results in the HTML parsing stage.
Lastly, regarding your question about substituting all angle brackets: I'd say in at least 99% of the cases, that's for obfuscation, i.e. to hide (from anti-virus software, censoring proxies (like in your example (nested parens are awesome)), etc.) the fact that your JavaScript is doing some HTML-y stuff. I can't think of good technical reasons to hide anything but </script>, at least not for reasonably modern browsers (and by that, I mean pretty much anything newer than Mosaic).
Some parsers treat the version written with a literal < as the closing tag and interpret the code as:
<script>
window.jQuery || document.write('<script src="js/libs/jquery-1.6.1.min.js">
</script>
\x3C is the hexadecimal escape for <. The two are interchangeable within a JavaScript string in the script.
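The difference between what is typed in the page and what the script engine ends up with can be sketched as follows. String.raw is used here only as a stand-in for "the characters as typed", and the variable names are illustrative.

```javascript
const closingTag = "<" + "/script>";

// String.raw leaves the backslash escape unprocessed, which models the
// source characters that the HTML parser scans.
const asTyped = String.raw`\x3C/script>`;

// An ordinary string literal is what the JavaScript engine produces
// after it has processed the escape sequence.
const asEvaluated = "\x3C/script>";

console.log(asTyped.includes(closingTag));  // false
console.log(asEvaluated === closingTag);    // true
console.log("\x3C".charCodeAt(0) === 0x3c); // true
```

So the HTML parser never encounters the closing-tag sequence inside the script, while document.write still receives a string containing a real closing tag.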