This is for .NET. IgnoreCase is set and MultiLine is NOT set.
Usually I'm decent at regex, maybe I'm running low on caffeine...
Users are allowed to enter
HtmlRuleSanitizer is built on top of the HTML Agility Pack and has a simple syntax for sanitizing tags.
The method HtmlSanitizer.SimpleHtml5Sanitizer() generates a sanitizer that has everything I need in it, but here's a more dynamic approach:
    public static string GetLimitedHtml(string value)
    {
        // Start from the library's preset HTML5 sanitizer.
        var sanitizer = HtmlSanitizer.SimpleHtml5Sanitizer();

        // Tags to register on top of the preset.
        var allowed = new string[] {"br", "h1", "h2", "h3", "h4", "h5", "h6", "small", "strike", "strong", "b"};
        foreach (var tag in allowed)
        {
            sanitizer.Tag(tag);
        }

        return sanitizer.Sanitize(value);
    }
I just noticed the current solution allows tags that start with any of the acceptable tags. Thus, if "b" is an acceptable tag, "blink" is too. Not a huge deal, but something to consider if you are strict about how you filter HTML. You certainly wouldn't want to allow "s" as an acceptable tag, as it would allow "script".
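To make the prefix problem concrete, here is a minimal sketch. The whitelist and both patterns are hypothetical (the pattern under discussion is not reproduced here); they only illustrate how a lookahead for whitespace, "/", or ">" after the tag name stops "b" from also matching "blink".

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagPrefixDemo
{
    // Hypothetical whitelist for illustration only.
    private const string Allowed = "b|br|strong";

    // Loose: accepts any tag whose name merely STARTS with an allowed name.
    public static readonly Regex Loose =
        new Regex(@"^</?(" + Allowed + ")", RegexOptions.IgnoreCase);

    // Strict: the allowed name must be followed by whitespace, "/", or ">",
    // so the name has to end right there.
    public static readonly Regex Strict =
        new Regex(@"^</?(" + Allowed + @")(?=[\s/>])", RegexOptions.IgnoreCase);

    public static void Main()
    {
        var samples = new[] { "<b>", "<br />", "<blink>", "<strong class=\"x\">" };
        foreach (var tag in samples)
        {
            Console.WriteLine($"{tag}: loose={Loose.IsMatch(tag)}, strict={Strict.IsMatch(tag)}");
        }
    }
}
```

With these patterns, "<blink>" gets past the loose check but is rejected by the strict one, while "<b>", "<br />" and "<strong class="x">" pass both. The same lookahead is what would keep an allowed "s" from letting "script" through.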