There are many Stack Overflow questions (e.g. Whitelisting, preventing XSS with WMD control in C# and WMD Markdown and server-side) about how to do server-side
One possible fix is in wmd.js, in the pushPreviewHtml()
method. Here's the original code from the Stack Overflow version of WMD on GitHub:
if (wmd.panels.preview) {
wmd.panels.preview.innerHTML = text;
}
You can replace it with some scrubbing code. Here's an adaptation of the code that Stack Overflow uses in response to this post, which restricts to a whitelist of tags, and for IMG and A elements, restricts to a whitelist of attributes (and in a specific order too!). See the Meta Stack Overflow post What HTML tags are allowed on Stack Overflow, Server Fault, and Super User? for more info on the whitelist.
Note: this code can certainly be improved, e.g. to allow whitelisted attributes in any order. It also disallows mailto: URLs which is probably a good thing on Internet sites but on your own intranet site it may not be the best approach.
if (wmd.panels.preview) {
// Original WMD code allowed JavaScript injection, like this:
//
// Now, we first ensure elements (and attributes of IMG and A elements) are in a whitelist,
// and if not in whitelist, replace with blanks in preview to prevent XSS attacks
// when editing malicious Markdown.
var okTags = /^(<\/?(b|blockquote|code|del|dd|dl|dt|em|h1|h2|h3|i|kbd|li|ol|p|pre|s|sup|sub|strong|strike|ul)>|<(br|hr)\s?\/?>)$/i;
var okLinks = /^(]+")?\s?>|<\/a>)$/i;
var okImg = /^(]*")?(\stitle="[^"<>]*")?\s?\/?>)$/i;
text = text.replace(/<[^<>]*>?/gi, function (tag) {
return (tag.match(okTags) || tag.match(okLinks) || tag.match(okImg)) ? tag : ""
})
wmd.panels.preview.innerHTML = text; // Original code
}
Also note that this fix is not in the Stack Overflow version of WMD on GitHub-- clearly the change was made later and not checked back into GitHub.
UPDATE: in order to avoid breaking the feature where hyperlinks are auto-created when you type in a URL, you also will need to make changes to showdown.js, like below:
Original code:
var _DoAutoLinks = function(text) {
text = text.replace(/<((https?|ftp|dict):[^'">\s]+)>/gi,"$1");
// Email addresses:
/*
text = text.replace(/
<
(?:mailto:)?
(
[-.\w]+
\@
[-a-z0-9]+(\.[-a-z0-9]+)*\.[a-z]+
)
>
/gi, _DoAutoLinks_callback());
*/
text = text.replace(/<(?:mailto:)?([-.\w]+\@[-a-z0-9]+(\.[-a-z0-9]+)*\.[a-z]+)>/gi,
function(wholeMatch,m1) {
return _EncodeEmailAddress( _UnescapeSpecialChars(m1) );
}
);
return text;
}
Fixed code:
var _DoAutoLinks = function(text) {
// use simplified format for links, to enable whitelisting link attributes
text = text.replace(/(^|\s)(https?|ftp)(:\/\/[-A-Z0-9+&@#\/%?=~_|\[\]\(\)!:,\.;]*[-A-Z0-9+&@#\/%=~_|\[\]])($|\W)/gi, "$1<$2$3>$4");
text = text.replace(/<((https?|ftp):[^'">\s]+)>/gi, '$1');
return text;
}