Currently I use org.apache.commons.lang.StringEscapeUtils.escapeHtml() to escape unwanted HTML tags in my strings, but then I realized it escapes characters with
I know it is too late to add my comment, but perhaps the following code will be helpful:
    public static String escapeHtml(String string) {
        StringBuilder escapedTxt = new StringBuilder();
        for (int i = 0; i < string.length(); i++) {
            char tmp = string.charAt(i);
            // Replace each of the six significant characters with its entity.
            switch (tmp) {
            case '<':
                escapedTxt.append("&lt;");
                break;
            case '>':
                escapedTxt.append("&gt;");
                break;
            case '&':
                escapedTxt.append("&amp;");
                break;
            case '"':
                escapedTxt.append("&quot;");
                break;
            case '\'':
                escapedTxt.append("&#x27;");
                break;
            case '/':
                escapedTxt.append("&#x2F;");
                break;
            default:
                // Every other character is copied unchanged.
                escapedTxt.append(tmp);
            }
        }
        return escapedTxt.toString();
    }
enjoy!
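A quick way to sanity-check a hand-written escaper like this is to run it on a string that contains all six characters. The following is only a minimal, self-contained sketch: the class name `EscapeCheck` and the helper `entityFor` are made up for illustration, and the method is a compact restatement of the loop above rather than a different technique.

```java
final class EscapeCheck {
    // Hypothetical helper: returns the entity for one of the six
    // significant characters, or null if the character needs no escaping.
    static String entityFor(char c) {
        switch (c) {
            case '&':  return "&amp;";
            case '<':  return "&lt;";
            case '>':  return "&gt;";
            case '"':  return "&quot;";
            case '\'': return "&#x27;";
            case '/':  return "&#x2F;";
            default:   return null;
        }
    }

    // Walks the input once and appends either the entity or the raw character.
    static String escapeHtml(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            String entity = entityFor(c);
            if (entity != null) {
                out.append(entity);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeHtml("<a href=\"x\">Tom & Jerry's</a>"));
        // prints: &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&#x27;s&lt;&#x2F;a&gt;
    }
}
```

Because the input is scanned exactly once, an "&" that belongs to an entity already written to the output is never escaped a second time.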
If you're using Wicket, use:
    import org.apache.wicket.util.string.Strings;
    ...
    // escapeMarkup returns a CharSequence; call toString() if you need a String
    CharSequence cs = Strings.escapeMarkup(src);
    String str = Strings.escapeMarkup(src).toString();
If it's for Android, use TextUtils.htmlEncode(String) instead.
Here's a version that replaces the six significant characters as recommended by OWASP. This is suitable for HTML content elements like <textarea>...</textarea>, but not for HTML attributes like <input value="...">, because the latter are often left unquoted.
    StringUtils.replaceEach(text,
        new String[]{"&", "<", ">", "\"", "'", "/"},
        new String[]{"&amp;", "&lt;", "&gt;", "&quot;", "&#x27;", "&#x2F;"});
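To illustrate the caveat about unquoted attributes: none of the six escaped characters is a space or an equals sign, so an escaped value can still end an unquoted attribute and start a new one. The sketch below uses only the standard library; `UnquotedAttributeDemo`, `escapeSix` and the sample payload are assumptions made up for the demonstration, and `escapeSix` is a stand-in for the replacement above, not the Commons Lang call itself.

```java
final class UnquotedAttributeDemo {
    // Stdlib-only stand-in for the six-character replacement.
    // "&" is replaced first so the entities added afterwards are not re-escaped.
    static String escapeSix(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&#x27;")
                .replace("/", "&#x2F;");
    }

    public static void main(String[] args) {
        // Contains none of the six characters, so escaping leaves it unchanged.
        String payload = "x onmouseover=alert(1)";
        String escaped = escapeSix(payload);

        // Unquoted: the space ends the value and a second attribute appears.
        System.out.println("<input value=" + escaped + ">");
        // prints: <input value=x onmouseover=alert(1)>

        // Quoted: the whole string stays inside a single attribute value.
        System.out.println("<input value=\"" + escaped + "\">");
        // prints: <input value="x onmouseover=alert(1)">
    }
}
```

Always quoting attribute values, in addition to escaping, avoids the first case.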
    StringUtils.replaceEach(str, new String[]{"&", "\"", "<", ">"}, new String[]{"&amp;", "&quot;", "&lt;", "&gt;"})
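One design note on the call above: as far as I know, replaceEach makes a single pass and does not re-scan text it has already replaced (that is what replaceEachRepeatedly is for), so the position of "&" in the array should not matter. If you fall back to chained String.replace calls from the standard library instead, the order does matter. The sketch below shows the difference; the class and method names are hypothetical.

```java
final class ReplaceOrderDemo {
    // Correct order: "&" first, so later entities are left alone.
    static String ampersandFirst(String s) {
        return s.replace("&", "&amp;")
                .replace("\"", "&quot;")
                .replace("<", "&lt;")
                .replace(">", "&gt;");
    }

    // Wrong order: the "&" of entities produced earlier is escaped again.
    static String ampersandLast(String s) {
        return s.replace("\"", "&quot;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("&", "&amp;");
    }

    public static void main(String[] args) {
        String input = "a < b & c";
        System.out.println(ampersandFirst(input)); // prints: a &lt; b &amp; c
        System.out.println(ampersandLast(input));  // prints: a &amp;lt; b &amp; c
    }
}
```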
This looks very good to me:
org/apache/commons/lang3/StringEscapeUtils.html#escapeXml(java.lang.String)
By asking for XML, you get XHTML, which is good HTML.