Question
Are there any problems with what I am doing here? This is my first time dealing with something like this, and I just want to make sure I understand the risks of the different methods.
I am using WMD to get user input, and I am displaying it with a Literal control. Since the content cannot be edited once entered, I will store the HTML rather than the Markdown, encoding it first:
    string input = Server.HtmlEncode(stringThatComesFromWMDTextArea);
Then I run something like the following for the tags I want users to be able to use:
    // Unescape whitelisted tags.
    string output = input.Replace("&lt;b&gt;", "<b>").Replace("&lt;/b&gt;", "</b>")
                         .Replace("&lt;i&gt;", "<i>").Replace("&lt;/i&gt;", "</i>");
Edit: Here is what I am doing currently:
    // Requires: using System.Web; (for HttpUtility)
    public static string EncodeAndWhitelist(string html)
    {
        // Tags that are unescaped again after encoding (exact matches only, no attributes).
        string[] whiteList = { "b", "i", "strong", "img", "ul", "li" };
        string encodedHTML = HttpUtility.HtmlEncode(html);
        foreach (string wl in whiteList)
        {
            encodedHTML = encodedHTML
                .Replace("&lt;" + wl + "&gt;", "<" + wl + ">")
                .Replace("&lt;/" + wl + "&gt;", "</" + wl + ">");
        }
        return encodedHTML;
    }
- Will what I am doing here keep me protected from XSS?
- Are there any other considerations that should be made?
- Is there a good list of normal tags to whitelist?
Answer 1:
If your requirements really are so basic that such simple string replacements suffice, then yes, this is ‘safe’ against XSS. (However, it's still possible to submit non-well-formed content where <i> and <b> are mis-nested or unclosed, which could potentially mess up the page the content ends up inserted into.)
But this is rarely enough. For example, <a href="..."> and <img src="..." /> are currently not allowed. If you wanted to allow these or other markup with attribute values, you'd have a whole lot more work to do. You might then approach it with regex, but that gives you endless problems with accidental nesting and replacement of already-replaced content, because regex can't parse HTML.
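To make the two limitations above concrete, here is a small sketch of the question's function. The class name `WhitelistDemo` is hypothetical, and `System.Net.WebUtility.HtmlEncode` is swapped in for `HttpUtility.HtmlEncode` so the sketch runs without a reference to System.Web; that the two encode `<`, `>` and `"` identically is an assumption of this example.

```csharp
using System;
using System.Net;

public static class WhitelistDemo
{
    // Encode everything, then unescape exact matches of whitelisted tags.
    // WebUtility is an assumption here; the question uses HttpUtility.
    public static string EncodeAndWhitelist(string html)
    {
        string[] whiteList = { "b", "i", "strong", "img", "ul", "li" };
        string encoded = WebUtility.HtmlEncode(html);
        foreach (string wl in whiteList)
        {
            encoded = encoded
                .Replace("&lt;" + wl + "&gt;", "<" + wl + ">")
                .Replace("&lt;/" + wl + "&gt;", "</" + wl + ">");
        }
        return encoded;
    }
}
```

With this sketch, `EncodeAndWhitelist("<script>alert(1)</script>")` stays fully encoded, but so does `EncodeAndWhitelist("<img src=\"a.png\" />")`, because the encoded text never contains the exact string `&lt;img&gt;`. Meanwhile `EncodeAndWhitelist("<b>unclosed")` comes back as `<b>unclosed`, so an unclosed tag reaches the page unchanged.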
To solve both problems, the usual approach is to use an [X][HT]ML parser on the input, then walk the DOM removing all but known-good elements and attributes, then finally re-serialise to [X]HTML. The result is then guaranteed well-formed and contains only safe content.
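A minimal sketch of that parse, walk and re-serialise approach follows. It assumes the input is well-formed XML and uses `System.Xml.Linq` from the standard library rather than a real HTML parser, so anything that fails to parse (including HTML-only entities such as `&nbsp;`) is simply encoded as plain text. The class name `XhtmlSanitizer`, the whitelist contents, and the policy of dropping disallowed elements together with their content are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Xml;
using System.Xml.Linq;

public static class XhtmlSanitizer
{
    // Hypothetical whitelist: element name -> attribute names allowed on it.
    private static readonly Dictionary<string, string[]> Allowed =
        new Dictionary<string, string[]>
        {
            { "b", new string[0] }, { "i", new string[0] },
            { "strong", new string[0] },
            { "ul", new string[0] }, { "li", new string[0] },
            { "a", new[] { "href" } },
            { "img", new[] { "src", "alt" } },
        };

    public static string Sanitize(string fragment)
    {
        XElement root;
        try
        {
            // Wrap in a dummy root so a fragment with several top-level nodes parses.
            root = XElement.Parse("<root>" + fragment + "</root>",
                                  LoadOptions.PreserveWhitespace);
        }
        catch (XmlException)
        {
            // Not well-formed: treat the whole input as plain text.
            return WebUtility.HtmlEncode(fragment);
        }
        Clean(root);
        // Re-serialise only the children of the dummy root.
        return string.Concat(
            root.Nodes().Select(n => n.ToString(SaveOptions.DisableFormatting)));
    }

    private static void Clean(XElement parent)
    {
        foreach (XNode node in parent.Nodes().ToList())
        {
            if (node is XCData cdata)
            {
                // Re-emit CDATA as ordinary escaped text.
                cdata.ReplaceWith(new XText(cdata.Value));
            }
            else if (node is XText)
            {
                // Plain text is kept; it is escaped again on output.
            }
            else if (node is XElement el && IsAllowed(el))
            {
                CleanAttributes(el);
                Clean(el);
            }
            else
            {
                // Comments, processing instructions, disallowed elements.
                node.Remove();
            }
        }
    }

    private static bool IsAllowed(XElement el)
    {
        return el.Name.Namespace == XNamespace.None
            && Allowed.ContainsKey(el.Name.LocalName);
    }

    private static void CleanAttributes(XElement el)
    {
        string[] allowedAttrs = Allowed[el.Name.LocalName];
        foreach (XAttribute attr in el.Attributes().ToList())
        {
            string name = attr.Name.LocalName;
            if (attr.Name.Namespace != XNamespace.None || !allowedAttrs.Contains(name))
            {
                attr.Remove();
                continue;
            }
            if (name == "href" || name == "src")
            {
                // Keep only absolute http/https URLs; this rejects javascript: and similar.
                Uri uri;
                bool ok = Uri.TryCreate(attr.Value, UriKind.Absolute, out uri)
                    && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);
                if (ok) attr.Value = uri.AbsoluteUri;
                else attr.Remove();
            }
        }
    }
}
```

For example, `XhtmlSanitizer.Sanitize("<b>hi</b><script>alert(1)</script>")` returns `<b>hi</b>`, and an `<a>` with `href="javascript:alert(1)"` and an `onclick` handler comes back as a bare `<a>link</a>`. Note that whitelisting attribute names alone is not enough: URL-valued attributes also need their scheme checked, which is why `href` and `src` get separate treatment here. Real-world HTML that is not valid XML would need a proper HTML parser in place of `XElement.Parse`.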
Source: https://stackoverflow.com/questions/2104520/whitelisting-preventing-xss-with-wmd-control-in-c-sharp