Basically, I would like to decode a given HTML document and replace all special characters, such as "&nbsp;" -> " " and "&gt;" -> ">".
-
Consider using the HtmlManipulator Java class. You may need to add some items (not all entities are in the list).
The Apache Commons StringEscapeUtils, as suggested by Kevin Hakanson, did not work 100% for me; several entities, such as the one for the left single quote, were translated into '222' somehow. I also tried org.jsoup and had the same problem.
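If you end up maintaining your own entity list, as suggested above, the idea can be sketched with nothing but the JDK. This is only a rough illustration, not the HtmlManipulator source: the class name `EntityDecoder`, the method `decode`, and the small `NAMED` table are all made up for this example, and the table is deliberately incomplete, so you would add whatever entities your documents actually use. Numeric references (decimal and hex) are converted straight to the code point they name, and anything unrecognized is left as-is.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of any library mentioned above.
final class EntityDecoder {

    // Small, intentionally incomplete table of named entities. Extend as needed.
    private static final Map<String, String> NAMED = new HashMap<>();
    static {
        NAMED.put("nbsp", "\u00A0");
        NAMED.put("lt", "<");
        NAMED.put("gt", ">");
        NAMED.put("amp", "&");
        NAMED.put("quot", "\"");
        NAMED.put("apos", "'");
        NAMED.put("lsquo", "\u2018");
        NAMED.put("rsquo", "\u2019");
    }

    // Matches named (&gt;), decimal (&#62;) and hexadecimal (&#x3E;) references.
    private static final Pattern ENTITY =
            Pattern.compile("&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);");

    static String decode(String html) {
        Matcher m = ENTITY.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = resolve(m.group(1));
            if (replacement == null) {
                replacement = m.group(); // unknown reference: keep the original text
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Returns the decoded text for one reference body, or null if it is unknown.
    private static String resolve(String body) {
        if (body.charAt(0) != '#') {
            return NAMED.get(body);
        }
        try {
            boolean hex = body.charAt(1) == 'x' || body.charAt(1) == 'X';
            int codePoint = hex
                    ? Integer.parseInt(body.substring(2), 16)
                    : Integer.parseInt(body.substring(1));
            if (!Character.isValidCodePoint(codePoint)) {
                return null;
            }
            return new String(Character.toChars(codePoint));
        } catch (NumberFormatException e) {
            return null; // number too large to be a code point
        }
    }
}
```

Calling `EntityDecoder.decode("a &gt; b")` gives back `a > b`. Note that `&nbsp;` becomes a real non-breaking space (U+00A0), not an ordinary space, so if you want plain spaces you would change that one table entry.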