Is it possible to remove script tags in the <head> of an HTML document client-side, prior to execution of those tags?
On the server-side I am able
You could try to use the DOM Mutation events:
DOMAttrModified
DOMAttributeNameChanged
DOMCharacterDataModified
DOMElementNameChanged
DOMNodeInserted
DOMNodeInsertedIntoDocument
DOMNodeRemoved
DOMNodeRemovedFromDocument
DOMSubtreeModified
like so:
document.head.addEventListener('DOMNodeInserted', function(ev) {
    if (ev.target.tagName == 'SCRIPT') {
        ev.target.parentNode.removeChild(ev.target);
    }
}, false);
You can also try the newer way of doing this, through MutationObserver (the mutation events listed above are deprecated).
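A minimal sketch of the MutationObserver route, assuming it runs in an inline script at the very top of the document. The helper name removeAddedScripts is made up for illustration, and whether removing the node actually stops the script from running depends on the browser and on how the script was inserted, so treat it as something to experiment with rather than a guarantee.

```javascript
// Hypothetical helper: detaches every <script> element found among added nodes.
// Returns the number of nodes it removed.
function removeAddedScripts(mutations) {
  var removed = 0;
  mutations.forEach(function (mutation) {
    Array.prototype.forEach.call(mutation.addedNodes, function (node) {
      // nodeType 1 is an element node
      if (node.nodeType === 1 && node.tagName === 'SCRIPT' && node.parentNode) {
        node.parentNode.removeChild(node);
        removed++;
      }
    });
  });
  return removed;
}

// Only wire the observer up where the DOM APIs exist (i.e. in a browser).
if (typeof MutationObserver !== 'undefined' && typeof document !== 'undefined') {
  new MutationObserver(removeAddedScripts).observe(document.documentElement, {
    childList: true, // report added and removed child nodes
    subtree: true    // for the whole document, not just direct children
  });
}
```

The observer is attached to document.documentElement rather than document.head so that scripts in the body are reported as well.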
Since you cannot prevent future <script> tags from evaluating, a different approach needs to be taken. (Whenever the </script> tag has been found, the corresponding code of <script> is fetched and evaluated. <script src> will block a document from loading further till the source is fetched, unless the async attribute is set.)
Before I present the solution, I ask: What can prevent a script within a <script> tag from executing? Indeed,
1. Removing the <script> from the source code.
2. [...]
3. Triggering a run-time error.
1 is obvious, and 2 can be derived from the documentation, so I'll focus on 3. The examples below are obvious, and need to be adjusted for real-world use cases.
Here's a general pattern for proxying existing methods:
(function(Math) {
    var original_method = Math.random;
    Math.random = function() {
        // use arguments.callee to read source code of caller function
        // (arguments.callee is not available in strict mode)
        if (/somepattern/.test(arguments.callee.caller)) {
            Math.random = original_method; // Restore (run once)
            throw 'Prevented execution!';
        }
        return original_method.apply(this, arguments); // Generic method proxy
    };
})(Math);
// Demo:
function ok() { return Math.random(); }
function notok() { var somepattern; return Math.random(); }
In this example, the code-blocker runs only once. You can remove the restoration line, or add var counter=0; and if(++counter > 1337) to restore the method after 1337 calls.
arguments.callee.caller is null if the caller is not a function (e.g. top-level code). Not a disaster: you can read from the arguments or the this keyword, or any other environment variable, to determine whether the execution must be stopped.
Demo: http://jsfiddle.net/qFnMX/
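For the case where arguments.callee.caller is unavailable, here is a rough sketch of the same proxy pattern that decides from the arguments and the this value instead. The names guardMethod, shouldBlock and api are invented for the example, and the predicate is an assumption about what a script to be blocked might pass in.

```javascript
// Hypothetical helper: wraps obj[name] so that calls for which
// shouldBlock(thisValue, args) returns true throw instead of running.
function guardMethod(obj, name, shouldBlock) {
  var original = obj[name];
  obj[name] = function () {
    if (shouldBlock(this, arguments)) {
      obj[name] = original; // Restore (run once), as in the pattern above
      throw new Error('Prevented execution!');
    }
    return original.apply(this, arguments); // Generic method proxy
  };
}

// Demo on a plain object standing in for whatever API the script calls.
var api = {
  greet: function (who) { return 'hello ' + who; }
};
guardMethod(api, 'greet', function (self, args) {
  return args[0] === 'tracker'; // block only calls with this argument
});
```

Calling api.greet('world') passes straight through. The first api.greet('tracker') throws, which aborts the calling script, and after that the original method is back in place.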
Here's a general pattern for breaking setters:
Object.defineProperty(window, 'undefinable', {set:function(){}});
/*fail*/ function undefinable() {} // or window.undefinable = function(){};
Demo: http://jsfiddle.net/qFnMX/2/
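The effect of the no-op setter can be checked outside a browser with a small sketch. Here a plain object named sandbox (an assumption for the example) stands in for window, so this only shows the assignment case, not what happens to a global function declaration.

```javascript
// `sandbox` stands in for window so the sketch runs anywhere.
var sandbox = {};

Object.defineProperty(sandbox, 'undefinable', {
  set: function () {} // silently discard every assigned value
});

// A script trying to install its function gets no error...
sandbox.undefinable = function () { return 'payload'; };

// ...but nothing was stored: with no getter defined, reads yield undefined.
var result = typeof sandbox.undefinable; // 'undefined'
```

Any later call to sandbox.undefinable() then fails with a TypeError, which is what stops the dependent script.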
And getters, of course:
(function() {
    var actualValue;
    Object.defineProperty(window, 'unreadable', {
        set: function(value) {
            // Allow all setters for example
            actualValue = value;
        },
        get: function() {
            if (/somepattern/.test(arguments.callee.caller)) {
                // Restore, by deleting the property, then assigning value:
                delete window.unreadable;
                window.unreadable = actualValue;
                throw 'Prevented execution!';
            }
            return actualValue;
        },
        configurable: true // Allow re-definition of property descriptor
    });
})();
function notok() {var somepattern = window.unreadable; }
// Now OK, because the property was restored after the first blocked read
function nowok() {var somepattern = window.unreadable; }
function ok() {return unreadable;}
Demo: http://jsfiddle.net/qFnMX/4/
And so on. Look in the source code of the scripts you want to block, and you should be able to create a script-specific (or even generic) script-breaking pattern.
The only downside of the error-triggering method is that the error is logged in the console. For normal users, this should not be a problem at all.
Right, had another slightly less mad idea than my first, but it does depend on exactly what control you have on being able to insert tags in the head of the pages:
Put simply, if you can insert a <noscript> tag like I have below before any of the <script> declarations in the head, and you can then append a </noscript> tag to the end of the head, along with the final script snippet, you should be able to do whatever you want with the markup between the noscript tags before it is written back to the page.
The nice thing about this approach is that script-disabled agents will just ignore the noscript wrapper and parse the markup as normal, whereas script-enabled agents will store the content but not use it... exactly what is needed.
Whilst this is designed to be used with the head, it could easily be used the same way in the body, although it would have to be a separate implementation. This is because it has to work with a balanced and complete node tree, due to the nature of tags (unless you can manage to wrap the entire markup in noscript?!?).
It's not foolproof, because scripts can lie outside of the head and body tags - at least before they are parsed - but it seems to work pretty confidently on everything I've tested so far... and it doesn't rely on a smattering of randomly ajax-powered code that'll break at the first sign of a browser update ;)
Plus I also like the idea of script tags within noscript tags...
<head>
  <meta http-equiv="content-type" content="text/html; charset=utf-8" />
  <noscript id="__disabled__">
    <script src="jquery.js"></script>
    <title>Another example</title>
    <script>alert(1);</script>
    <link rel="stylesheet" type="text/css" href="core.css" />
    <style>body { background: #ffffd; }</style>
  </noscript>
  <script>
  (function(){
    var noscript = document.getElementById('__disabled__');
    if ( noscript ) {
      document.write(
        String(noscript.innerHTML)
          /// IE entity encodes noscript content, so reverse
          .replace(/&gt;/gi,'>')
          .replace(/&lt;/gi,'<')
          /// simple disable script regexp
          .replace(/<script[^>]*>/gi,'<'+'!--')
          .replace(/<\/script>/gi,'//--'+'>')
      );
    }
  })()
  </script>
</head>
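The string rewriting at the heart of that snippet can be tried on its own, away from the page. A small sketch, where the function name disableScripts and the sample markup are just for illustration:

```javascript
// Turns every <script ...> ... </script> pair into an HTML comment,
// using the same two regular expressions as the snippet above.
function disableScripts(markup) {
  return String(markup)
    .replace(/<script[^>]*>/gi, '<' + '!--')
    .replace(/<\/script>/gi, '//--' + '>');
}

// '<\/script>' is the same string as '</script>'; the backslash only keeps an
// HTML parser from ending the surrounding script element early.
var sample = '<script src="jquery.js"><\/script><title>Another example</title>';
var output = disableScripts(sample);
```

Being a simple regexp, it has the limits you'd expect: a closing tag written with whitespace, such as "</script >", is not matched, and a script body that itself contains "-->" would end the comment early.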
OK, so I have yet to test any of this in Internet Explorer (I doubt it'll work), and don't berate me for the horribleness of the hacks... I know ;) but it does seem to work in Firefox, Safari, Chrome and Opera on Mac OS X (the recent public releases of those user agents, at least). I'll see if I can improve it when I get access to a Windows machine... although I don't hold much hope for IE.
(function(xhr,d,de){
  d = document;
  try{
    de = ((de = d.getElementsByTagName('html')[0])
      ? de : ( d.documentElement ? d.documentElement : d.body ));
    /// this forces firefox to reassess its dom
    d.write(' ');
    /// make an ajax request to get the source of this page as a string
    /// this could be improved, I've just chucked it in as an example
    if (window.XMLHttpRequest) {
      xhr = new window.XMLHttpRequest;
    }else{
      xhr = new ActiveXObject("MSXML2.XMLHTTP");
    }
    if ( xhr ) {
      /// open non-async so the browser has to wait
      xhr.open('GET', window.location, false);
      xhr.onreadystatechange = function (e,o,ns){
        /// when we've got the source of the page... then
        if ((o = e.target) && (o.readyState == 4) && (o.status == 200)) {
          /// remove the script tags
          window.ns = ns = String(o.responseText)
            .replace(/<script[^>]*>/gi,'<'+'!--')
            .replace(/<\/script>/gi,'//--'+'>');
          /// fix for firefox - this causes a complete
          /// rewrite of the main docelm
          if ( 'MozBoxSizing' in de.style ) {
            de.innerHTML = ns;
          }
          /// fix for webkit, this seems to work, whereas
          /// normal document.write() doesn't. Probably
          /// because the window.location resets the document.
          else {
            window.location = 'javascript:document.write(window.ns);';
          }
        }
      };
      /// a GET request carries no body
      xhr.send(null);
    }
  }
  catch(ex){}
})();
Just to say I've tested this with nearly every type of script tag I can think of, placed wherever I could place them. And I haven't yet had one manage to break through. As I said, fun question... although I don't know how well the above would operate in a production environment :S ;)
Basically this will have to be placed as a script tag right at the top of the head tag.
A test example:
http://pebbl.co.uk/stackoverflow/12748067.html
No, you can't.
I cannot find official documentation right now, but according to High Performance JavaScript by Nicholas Zakas, which I'm reading, when the rendering engine finds a script tag, it stops rendering the HTML (so no other node is created), downloads the script and executes it. Then it continues rendering the HTML. That's why, when you call "document.write()" from a script tag, the result is added JUST after the tag, and then the rest of the page is rendered.
(I don't know if I can insert a paragraph of the book here...)
So it's not as if the page is rendered first and you can then remove the node so that the script won't be executed: when the browser finds a script tag, you cannot do anything until its code has been executed.
We had a very similar problem in our product: we added a script tag to the DOM and needed some code to be executed JUST before the new tag's execution started. After a week of research we had to find another solution.
Sorry, but I hope you don't waste as much time as we did. Anyway, I'll keep looking for the browser specification.