How to unescape HTML character entities in Java?

耶瑟儿~ 2020-11-21 22:38

Basically, I would like to decode a given HTML document and replace all special characters, such as "&nbsp;" -> " " and "&gt;" -> ">".

11 Answers
  •  粉色の甜心
    2020-11-21 23:17

    The most reliable way is with

    String cleanedString = StringEscapeUtils.unescapeHtml4(originalString);
    

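    from org.apache.commons.lang3.StringEscapeUtils. (Since Commons Lang 3.6 this class is deprecated in favor of org.apache.commons.text.StringEscapeUtils from Apache Commons Text, which offers the same method.)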
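
    If adding a Commons dependency is not an option, the idea behind the call can be sketched with the standard library alone. The class below is a hypothetical helper, not the Commons implementation: the name MiniHtmlUnescape, the six-entry entity table, and the choice to leave unknown entities untouched are all assumptions made for illustration, and it assumes Java 9 or later.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Commons Lang); covers only a few entities.
class MiniHtmlUnescape {
    // Illustrative subset of named entities; the full HTML table is far larger.
    private static final Map<String, String> NAMED = Map.of(
            "amp", "&",
            "lt", "<",
            "gt", ">",
            "quot", "\"",
            "apos", "'",
            "nbsp", "\u00A0");

    // Matches &#x1F; (hex, group 2), &#123; (decimal, group 3) and &name; (group 4).
    private static final Pattern ENTITY =
            Pattern.compile("&(#[xX]([0-9a-fA-F]+)|#([0-9]+)|([a-zA-Z][a-zA-Z0-9]*));");

    static String unescape(String input) {
        Matcher m = ENTITY.matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            // quoteReplacement keeps '$' and '\' in the decoded text literal.
            m.appendReplacement(out, Matcher.quoteReplacement(decode(m)));
        }
        m.appendTail(out);
        return out.toString();
    }

    private static String decode(Matcher m) {
        try {
            if (m.group(2) != null) {
                return new String(Character.toChars(Integer.parseInt(m.group(2), 16)));
            }
            if (m.group(3) != null) {
                return new String(Character.toChars(Integer.parseInt(m.group(3), 10)));
            }
        } catch (IllegalArgumentException e) {
            // Overflowing number or invalid code point: keep the original text.
            return m.group();
        }
        String known = NAMED.get(m.group(4));
        // Unknown named entities are left as they were written.
        return known != null ? known : m.group();
    }

    public static void main(String[] args) {
        System.out.println(unescape("&lt;p&gt;Tom &amp; Jerry&lt;/p&gt;"));
    }
}
```

    The input is scanned once, so a double-escaped sequence such as &amp;lt; decodes to &lt; rather than to <. For real documents the library call is still the safer choice, because it knows the complete entity table.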

    And to strip leading and trailing whitespace

    cleanedString = cleanedString.trim();
    

    This ensures that stray whitespace picked up by copying and pasting into web forms does not get persisted in the DB.
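
    One caveat when trimming decoded text: String.trim() only removes characters with code points up to U+0020, so a non-breaking space (U+00A0, which is what &nbsp; decodes to) survives it. A minimal sketch of one way to handle this, assuming it is acceptable to turn non-breaking spaces into ordinary ones; the class and method names are hypothetical:

```java
// Hypothetical helper illustrating trim() behavior on decoded &nbsp; characters.
class TrimAfterUnescape {
    // Replace non-breaking spaces with regular spaces, then trim both ends.
    static String normalizeAndTrim(String s) {
        return s.replace('\u00A0', ' ').trim();
    }

    public static void main(String[] args) {
        // Stands in for the result of unescaping "&nbsp; hello &nbsp;".
        String decoded = "\u00A0 hello \u00A0";
        // trim() alone leaves the string unchanged because it starts and ends with U+00A0.
        System.out.println("[" + decoded.trim() + "]");
        System.out.println("[" + normalizeAndTrim(decoded) + "]"); // prints "[hello]"
    }
}
```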
