I am using a contentEditable div that allows users to edit the body HTML and then post it directly to site using an AJAX request. Naturally, I have to do some security checks o
Yes. There are an alarming number of ways that malicious code can be injected into your site.
Other answers have already mentioned all of the most obvious ones, but there are a lot of much more subtle ways to get in, and if you're going to accept user-submitted HTML code, you need to be aware of them all, because hackers don't just try the obvious stuff and then give up.
You need to check all event handling attributes - not just onclick, but everything: onfocus, onload, even onerror and onscroll can be hacked.
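As a rough illustration of checking every event attribute rather than a fixed list, here is a minimal Python sketch that flags any attribute whose name starts with "on". The names EventAttrFinder and find_event_attrs are made up for this example, and it only detects attributes; it is not a sanitizer. It also relies on Python's html.parser, which will not necessarily read broken markup the way a browser does.

```python
from html.parser import HTMLParser


class EventAttrFinder(HTMLParser):
    """Collects (tag, attribute) pairs whose attribute name starts with "on"."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.found = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling this.
        for name, _value in attrs:
            if name.startswith("on"):
                self.found.append((tag, name))


def find_event_attrs(markup):
    """Return every event-style attribute found in the markup (hypothetical helper)."""
    finder = EventAttrFinder()
    finder.feed(markup)
    finder.close()
    return finder.found
```

Matching on the "on" prefix means a handler you have never heard of is still flagged, which a hand-written list of names would miss.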
But more importantly than that, you need to watch out for hacks that are designed to get past your validation. For example, using broken HTML to confuse your parser into thinking it's safe:
<!--<img src="--><img src=fakeimageurl onerror=MaliciousCode();//">
or
<style><img src="</style><img src=fakeimageurl onerror=DoSomethingNasty();//">
or
<b <script>ReallySneakyJavascript();</script>0
All of these could easily slip past a validator.
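To make that concrete, here is a small Python sketch of a deliberately naive validator, written only for this illustration (TAG_RE, ATTR_NAME_RE and naive_event_attrs are hypothetical names). It tokenizes tags with a regular expression that trusts quoted attribute values, which is the kind of assumption the style example above exploits.

```python
import re

# Hypothetical naive validator: a tag is "<name attr=value ...>", and anything
# between double quotes is assumed to be a harmless attribute value.
ATTR_BODY = r'\s+[\w-]+(?:=(?:"[^"]*"|[^\s">]+))?'
TAG_RE = re.compile(r'<(\w+)((?:' + ATTR_BODY + r')*)\s*/?>')
ATTR_NAME_RE = re.compile(r'\s+([\w-]+)(?:=(?:"[^"]*"|[^\s">]+))?')


def naive_event_attrs(markup):
    """Return (tag, attribute) pairs the naive regex tokenizer considers event handlers."""
    found = []
    for tag in TAG_RE.finditer(markup):
        for attr in ATTR_NAME_RE.finditer(tag.group(2)):
            if attr.group(1).lower().startswith("on"):
                found.append((tag.group(1), attr.group(1)))
    return found


payload = '<style><img src="</style><img src=fakeimageurl onerror=DoSomethingNasty();//">'

# The validator sees <style> and then one <img> whose src value runs to the
# closing quote, so it reports no event handlers at all.
print(naive_event_attrs(payload))
```

A browser is expected to treat everything after the style tag as raw text until the closing style tag, and then parse the second img as a real element with an onerror attribute. The validator and the browser disagree about where the tags are, and that disagreement is the hole.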
And don't forget that a real hack is likely to be more obfuscated than this. They'll make an effort to make it hard for you to spot, or to understand what it's doing if you do spot it.
I'll finish by recommending this site: http://html5sec.org/ which has details of a large number of attack vectors, most of which I certainly wouldn't have thought of. (The examples above all feature in the list.)
Yes and yes.
There are A LOT of ways for users to inject scripts without script tags.
They can do it in JS handlers
<div onmouseover="myBadScript()" />
They can do it in hrefs
<a href="javascript:myBadScript()">Click me fool!!</a>
They can do it from an external source
<iframe src="http://www.myevilsite.com/mybadscripts.html" />
They can do it in ALL SORTS of ways.
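For the href case specifically, one common approach is to allow only a short list of URL schemes instead of trying to block javascript: and its variants. The following Python sketch assumes an allowlist of http, https and mailto (my choice for the example, not something from the question) and a hypothetical helper name is_safe_href. It covers only this one vector.

```python
from urllib.parse import urlsplit

# Assumption for this sketch: only these schemes are acceptable in links.
ALLOWED_SCHEMES = {"http", "https", "mailto"}


def is_safe_href(url):
    """Return True only for URLs with an explicitly allowed scheme."""
    # Reject whitespace and control characters outright, since they can be
    # used to disguise a scheme (for example a tab inside "javascript").
    if any(ch.isspace() or ord(ch) < 0x20 or ord(ch) == 0x7F for ch in url):
        return False
    try:
        scheme = urlsplit(url).scheme  # urlsplit lowercases the scheme
    except ValueError:
        return False
    return scheme in ALLOWED_SCHEMES
```

Note that this version also rejects relative URLs, because they have no scheme. That is deliberately strict; loosening it safely takes more care than fits here.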
I am afraid that the idea of allowing users to do this is just not a good one. Look at using wiki markup or Markdown instead. It'll be much safer.
Did you think about the security risk from <object> and <embed> objects?
I'd use strip_tags() for stripping HTML tags.
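If the backend is not PHP, the same idea can be sketched in Python: keep only the text and throw away every tag. TextOnly and strip_tags_py are names invented for this example, and the result depends on how Python's html.parser reads the input, so treat it as an illustration rather than a hardened filter.

```python
import html
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Keeps text content and ignores every tag and attribute."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags_py(markup):
    """Return the text content of the markup, escaped for output as HTML text."""
    parser = TextOnly()
    parser.feed(markup)
    parser.close()
    # Escape the remaining text so any leftover "<" or "&" is shown literally
    # instead of being parsed as markup.
    return html.escape("".join(parser.parts))
```

This removes all formatting, so it only fits if users do not actually need to submit HTML.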
Javascript can be called any number of ways by using the event attributes on elements, like:
<body onload="..">
A similar question posted here recommends using HTMLPurifier instead of trying to handle this on your own.