How to properly sanitize content with AntiXss Library?

Submitted by 泪湿孤枕 on 2019-12-07 03:04:06

Question


I have a simple forum application. When someone posts any content, I do:

post.Content = Sanitizer.GetSafeHtml(post.Content);

Now, I am not sure if I am doing something wrong or what is going on, but it allows almost no HTML. Even a simple <b></b> is too much for it. So I guess that tool is totally useless.

Now my question: can anyone tell me how I should sanitize my users' input so that they can post some images (<img> tags), use bold emphasis, etc.?


Answer 1:


It seems that many people find the sanitizer rather useless. Instead of using the sanitizer, just encode everything, and decode safe parts back:

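using System;
using System.Linq;
using System.Text;
// Assumed alias so that AntiXss.Encoder resolves to the AntiXSS library's Encoder class.
using AntiXss = Microsoft.Security.Application;

// Each pair holds the encoded form of an allowed tag (Item1) and the tag itself (Item2).
private static readonly Tuple<string, string>[] WhiteList = (new string[]
    {
        "<b>", "</b>", "<i>", "</i>"
    })
    .Select(tag => Tuple.Create(AntiXss.Encoder.HtmlEncode(tag), tag))
    .ToArray();

public static string Sanitize(string html)
{
    // Encode everything first, so no markup survives by default.
    var safeHtml = new StringBuilder(AntiXss.Encoder.HtmlEncode(html));

    // Then restore only the exact tags on the whitelist.
    foreach (var pair in WhiteList)
    {
        safeHtml.Replace(pair.Item1, pair.Item2);
    }

    return safeHtml.ToString();
}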

Please note that it's nearly impossible to safely decode an IMG tag, since there are really simple ways for an attacker to abuse this tag. Examples:

<IMG SRC="javascript:alert('XSS');">

<IMG SRC=&#106;&#97;&#118;&#97;&#115;&#99;&#114;&#105;&#112;&#116;&#58;&#97;&#108;&#101;&#114;&#116;&#40;&#39;&#88;&#83;&#83;&#39;&#41;>

Take a look here for a more thorough XSS Cheat Sheet.
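If images are really needed, one possible direction is to apply the same encode-first idea and restore an <img> tag only when it has exactly one src attribute whose value parses as an absolute http or https URL. The following is only a minimal sketch under those assumptions, not a vetted sanitizer: the class name ImgWhitelist and the pattern are made up for illustration, and System.Net.WebUtility.HtmlEncode is used as a stand-in encoder so that the example runs without the AntiXSS package.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class ImgWhitelist
{
    // Matches an already-encoded <img src="..."> tag. The URL group rejects
    // '&', quotes, angle brackets and whitespace, so no entity (such as the
    // &#106;... trick) can appear inside a restored URL.
    private static readonly Regex EncodedImg = new Regex(
        "&lt;img src=&quot;(?<url>[^&\\s<>\"']+)&quot;&gt;",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    public static string Sanitize(string html)
    {
        // Encode everything first; nothing is markup after this step.
        string encoded = WebUtility.HtmlEncode(html ?? string.Empty);

        // Restore only <img> tags whose src is an absolute http(s) URL.
        return EncodedImg.Replace(encoded, match =>
        {
            string url = match.Groups["url"].Value;
            Uri uri;
            bool allowed = Uri.TryCreate(url, UriKind.Absolute, out uri)
                && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);

            // Anything else (javascript:, data:, relative paths) stays encoded.
            return allowed ? "<img src=\"" + url + "\">" : match.Value;
        });
    }
}
```

With this sketch, <img src="https://example.com/a.png"> comes back as a real tag, while <IMG SRC="javascript:alert(1)"> stays encoded and is shown as text. Image URLs that contain & (for example query strings) are also left encoded; that restriction is deliberate, to keep the pattern simple.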




Answer 2:


This post best describes the issues with the AntiXSS library and provides a good workaround that whitelists a set of tags and attributes.

I'm using this solution in my project and it seems to work great.




Answer 3:


There is a quite simple way to block the threat by just getting rid of the "dangerous" tags.

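// Requires: using System.Collections.Generic; using System.Text.RegularExpressions;
string SanitizeHtml(string html)
{
    // Decode first so that entity-encoded tags are caught as well.
    html = System.Web.HttpUtility.HtmlDecode(html);

    List<string> blackListedTags = new List<string>()
    {
        "body", "script", "iframe", "form", "object", "embed", "link", "head", "meta"
    };

    // Turn every blacklisted opening and closing tag into a harmless <p> tag.
    foreach (string tag in blackListedTags)
    {
        html = Regex.Replace(html, "<" + tag, "<p", RegexOptions.IgnoreCase);
        html = Regex.Replace(html, "</" + tag, "</p", RegexOptions.IgnoreCase);
    }

    return html;
}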

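With this, the user will still see what is inside the dangerous script, but the blacklisted tags themselves can no longer do anything. Keep in mind that a blacklist like this only covers the tags it lists: event-handler attributes such as onerror and javascript: URLs on tags that are not listed (for example <img>) pass through unchanged.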



Source: https://stackoverflow.com/questions/12554194/how-to-properly-sanitize-content-with-antixss-library
