Is there a good way to remove HTML from a Java string? A simple regex like
replaceAll("\\\\<.*?>", &quo
You might want to replace
and
The only way I can think of removing HTML tags but leaving non-HTML between angle brackets would be check against a list of HTML tags. Something along these lines...
replaceAll("\\<[\s]*tag[^>]*>","")
Then HTML-decode special characters such as &
. The result should not be considered to be sanitized.