Basically, I would like to decode a given HTML document and replace all special characters, such as "&nbsp;" -> " " and "&gt;" -> ">".
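The most reliable way is unescapeHtml4 from org.apache.commons.lang3.StringEscapeUtils:
String cleanedString = StringEscapeUtils.unescapeHtml4(originalString);
Note that since Commons Lang 3.6 this class is deprecated in favor of org.apache.commons.text.StringEscapeUtils from Apache Commons Text, which offers the same method.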
And to strip leading and trailing whitespace:
cleanedString = cleanedString.trim();
This ensures that whitespace introduced by copying and pasting into web forms does not get persisted in the DB.
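One caveat: unescapeHtml4 turns "&nbsp;" into the non-breaking space U+00A0, and String.trim() only removes characters up to U+0020, so a decoded non-breaking space at either end of the string survives the trim. A minimal sketch of one way to handle that, using only the JDK and assuming the input has already been unescaped (the class and method names here are hypothetical, not part of Commons Lang):

```java
final class HtmlTextCleanup {
    private HtmlTextCleanup() {}

    // Hypothetical helper: expects text that has already been HTML-unescaped.
    static String normalizeAndTrim(String decoded) {
        if (decoded == null) {
            return null;
        }
        // A decoded &nbsp; is U+00A0, which String.trim() leaves in place,
        // so map it to a plain space before trimming.
        return decoded.replace('\u00A0', ' ').trim();
    }
}
```

It would be called as cleanedString = HtmlTextCleanup.normalizeAndTrim(cleanedString); in place of the plain trim() call. This also turns non-breaking spaces inside the text into ordinary spaces, which may or may not be what you want.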