How do I filter all HTML tags except a certain whitelist?

一整个雨季 2020-11-27 12:07

This is for .NET. IgnoreCase is set and MultiLine is NOT set.

Usually I'm decent at regex, maybe I'm running low on caffeine...

Users are allowed to enter

8 answers
  • 2020-11-27 12:46

    HtmlRuleSanitizer is built on top of the HTML Agility Pack and has a simple syntax for sanitizing tags.

    The method HtmlSanitizer.SimpleHtml5Sanitizer() generates a sanitizer that has everything I needed, but here's a more dynamic approach:

    public static string GetLimitedHtml(string value)
    {
        var sanitizer = HtmlSanitizer.SimpleHtml5Sanitizer();
        var allowed = new string[] {"br", "h1", "h2", "h3", "h4", "h5", "h6", "small", "strike", "strong", "b"};
        foreach (var tag in allowed)
        {
            sanitizer.Tag(tag);
        }
        
        return sanitizer.Sanitize(value);
    }
    
  • 2020-11-27 12:51

    I just noticed the current solution allows tags that start with any of the acceptable tags. Thus, if "b" is an acceptable tag, "blink" is too. Not a huge deal, but something to consider if you are strict about how you filter HTML. You certainly wouldn't want to allow "s" as an acceptable tag, as it would allow "script".
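
    To illustrate the prefix problem, here is a minimal sketch of a stricter match, not the solution referred to above. It assumes a regex-based filter; the class name TagWhitelist, the Strip method, and the three-tag whitelist are hypothetical. The tag name must be followed by whitespace, "/" or ">", so "b" no longer lets "blink" through:

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagWhitelist
{
    // Hypothetical whitelist, for illustration only.
    private static readonly string[] Allowed = { "b", "i", "br" };

    // Matches any tag whose name is NOT on the whitelist.
    // The optional "/" sits inside the negative lookahead so that closing
    // tags such as </b> are also recognised as allowed, and the inner
    // (?=[\s/>]) requires the name to end right there, which is what
    // rejects <blink> even though it starts with "b".
    private static readonly Regex Disallowed = new Regex(
        @"<(?!/?(?:" + string.Join("|", Allowed) + @")(?=[\s/>]))[^>]*>",
        RegexOptions.IgnoreCase);

    // Removes disallowed tags; the text between them is kept.
    public static string Strip(string html)
    {
        return Disallowed.Replace(html, string.Empty);
    }
}
```

    With this sketch, TagWhitelist.Strip("<b>bold</b><blink>x</blink>") keeps the b element and drops only the blink tags. It still leaves attributes on allowed tags untouched and can be confused by a ">" inside an attribute value, so a parser-based sanitizer remains the safer choice for untrusted input.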
