How can we safeguard our web applications from XSS attacks? An app is vulnerable to attack if it does not convert special characters.
Just to add to WhiteFang34's list:
jsoup has several built-in whitelists to choose from, such as allowing some HTML, no HTML, etc.
I chose this over Apache Commons's StringEscapeUtils.escapeHtml()
because of how it handles apostrophes. I.e. if our users type in:
Alan's mom had a good brownie recipe.
JSoup will leave the apostrophe alone, whereas Apache Commons would escape that string as:
Alan\'s mom had a good brownie recipe.
I wouldn't want to have to worry about unescaping that before displaying it to the user.
You should HTML escape any input before outputting it back to the user. Some references:
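To illustrate what HTML escaping on output means, here is a minimal sketch in plain Java. The class name `HtmlEscaper` and the method `escapeHtml` are made up for this example and do not come from any library; in a real project you would more likely rely on a maintained library or your template engine's built-in escaping. It only covers text placed in an HTML body or a quoted attribute value, not JavaScript, CSS, or URL contexts.

```java
// Hypothetical helper for illustration only; not part of any library.
public final class HtmlEscaper {

    private HtmlEscaper() {
        // static utility, no instances
    }

    /**
     * Replaces the five characters that have special meaning in HTML
     * with their entity equivalents. Returns an empty string for null.
     */
    public static String escapeHtml(String input) {
        if (input == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

Calling `HtmlEscaper.escapeHtml(userInput)` at the point where the value is written into the page means a `<script>` tag typed by a user is shown as text instead of being executed.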
HTML-escaping input works very well. But in some cases business rules might require you NOT to escape the HTML. Regex is not fit for that task; it is too hard to come up with a good solution using it.
The best solution I found was to use: http://jsoup.org/cookbook/cleaning-html/whitelist-sanitizer
It builds a DOM tree from the provided input and filters out any element not previously allowed by a Whitelist. The API also has other functions for cleaning up HTML.