I'm looking for a simple HTML sanitizer written in JavaScript. It doesn't need to be 100% XSS secure.
I'm implementing Markdown and the WMD Markdown editor (The S
For my function, I only cared that the string is not empty and contains only alphanumeric characters. It uses plain JS and no third-party libraries. It contains a long regex, but it does the job ;) You could build on this, but make your regex something more like '<script>|</script>' (with characters escaped where necessary). ;)
var validateString = function (string) {
  var validity = true;
  // An empty string is invalid.
  if (string === '') { validity = false; }
  // Any space or punctuation character from this block-list makes the string invalid.
  if (string.match(/[ |<|,|>|\.|\?|\/|:|;|"|'|{|\[|}|\]|\||\\|~|`|!|@|#|\$|%|\^|&|\*|\(|\)|_|\-|\+|=]+/) != null) {
    validity = false;
  }
  return validity;
};
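As a rough sketch of the same check, an allow-list tends to be shorter than enumerating every forbidden character. The name `isAlphanumeric` is made up for this example, and it assumes "alphanumeric" means ASCII letters and digits only:

```javascript
// Hypothetical helper: true only for non-empty strings made of ASCII letters and digits.
function isAlphanumeric(value) {
  // ^ and $ force the whole string to match; + rejects the empty string.
  return typeof value === 'string' && /^[a-z0-9]+$/i.test(value);
}

console.log(isAlphanumeric('abc123'));   // true
console.log(isAlphanumeric(''));         // false
console.log(isAlphanumeric('<script>')); // false
```

Unlike the block-list version, this also rejects characters that were never listed, such as tabs, newlines and accented letters, so it is stricter rather than equivalent.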
You should have a look at the one recommended in this question: Sanitize/Rewrite HTML on the Client Side
And just to be sure that you don't need to do more about XSS, please review the answers to this one: How to prevent Javascript injection attacks within user-generated HTML
We've developed a simple HtmlSanitizer and open-sourced it here: https://github.com/jitbit/HtmlSanitizer
Usage
var result = HtmlSanitizer.SanitizeHtml(input);
[Disclaimer! I'm one of the authors!]
Here is a 2kb Vue component (it depends on Snarkdown, a 1kb Markdown renderer; replace it with whatever you need) that renders escaped Markdown. It can optionally translate <b> and <i> tags into Markdown, for content that already uses those tags for formatting...
<template>
  <div v-html="html">
  </div>
</template>
<script>
import Snarkdown from 'snarkdown'
export default {
  props: ['code', 'bandi'],
  computed: {
    html () {
      // Convert b & i tags if flagged...
      const unsafe = this.bandi ? this.code
        .replace(/<b>/g, '**')
        .replace(/<\/b>/g, '**')
        .replace(/<i>/g, '*')
        .replace(/<\/i>/g, '*') : this.code
      // Process the markdown after we escape the html tags...
      return Snarkdown(unsafe
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;')
      )
    }
  }
}
</script>
As a comparison, vue-markdown is over 100kb. This won't render math formulas and such, but 99.99% of people won't use it for those things, so not sure why the most popular markdown components are so bloated :(
This is safe against XSS attacks and super fast.
Why did I use `&#39;` and not `&apos;`? Because: Why shouldn't `&apos;` be used to escape single quotes?
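If you want the escaping step on its own, outside the Vue component, it could look something like this. `escapeHtml` and `HTML_ESCAPES` are just names picked for this sketch; they are not part of Snarkdown or Vue:

```javascript
// The five characters that matter in HTML text and attribute values.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;'
};

// Hypothetical helper: replace each special character with its entity in one pass.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtml('<b onclick="x()">Tom & Jerry\'s</b>'));
// &lt;b onclick=&quot;x()&quot;&gt;Tom &amp; Jerry&#39;s&lt;/b&gt;
```

Doing it in a single pass means the order of replacements no longer matters; with chained `.replace()` calls the `&` replacement has to come first, or the ampersands of the other entities get escaped a second time.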
Not sure why this hasn't been mentioned yet... but your browser can sanitize for you.
Here is a 3-line HTML sanitizer that can sanitize 30x faster than any JavaScript variant by using the native HTML parser that comes with your browser... This is used in Vue/React/Angular and many other UI frameworks. Note that this does NOT escape HTML; it removes it.
const decoder = document.createElement('div')
decoder.innerHTML = YourXSSAttackHere
const sanitized = decoder.textContent
As proof this method is accepted and fast, here is a live link to the decoder used in Vue.js which uses the same pattern: https://github.com/vuejs/vue/blob/dev/src/compiler/parser/entity-decoder.js
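One thing to watch for with the snippet above, as far as I know: assigning untrusted markup to `innerHTML` of an element created by the page's own document can still start image loads, so something like `<img src=x onerror=...>` may run its handler even though the div is never attached. A variant that parses into a separate, inert document via `DOMParser` avoids that. This is only a sketch; `stripHtml` and the injectable `parser` argument are names invented here so the function can also be tried outside a browser:

```javascript
// Hypothetical helper: drop all markup from untrusted HTML and return only its text.
// `parser` defaults to the browser's DOMParser; passing one in is an assumption
// made for this sketch so the function can be exercised without a browser.
function stripHtml(untrusted, parser = new DOMParser()) {
  // parseFromString builds a separate document in which scripts do not run
  // and images are not fetched, so inline handlers never fire.
  const doc = parser.parseFromString(String(untrusted), 'text/html');
  return doc.body.textContent || '';
}

// In a browser:
// stripHtml('<b>Hello</b> <img src=x onerror=alert(1)>world');
```

The result is plain text with entities decoded, so `&lt;b&gt;` comes back as `<b>`. Put it into the page with `textContent`, not `innerHTML`, or escape it again first.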