Question
I need JSoup to remove scripts from an HTML string, and I am using this snippet to do it:
Document unsafeDoc = Jsoup.parse(unsafeHtml);
Document safeDoc = cleaner.clean(unsafeDoc);
OutputSettings o = safeDoc.outputSettings();
o.escapeMode(EscapeMode.xhtml);
return safeDoc.select("body").html();
But it inserts an extra space before <br> tags and converts " and ' to character entities such as &quot;, which I don't want. I could not find a way to prevent this. I would appreciate any help, or a recommendation for a library other than JSoup that can do this.
Thanks, Sanjay
Answer 1:
Try using:
safeDoc.outputSettings().prettyPrint(false);
I had the same problem and that fixed it.
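For reference, here is a minimal sketch of how that setting might fit into the snippet from the question. The class name ScriptStripper, the method name stripScripts, and the choice of Safelist.basic() are assumptions for illustration only; the question does not show how its cleaner was built. Safelist is the name used in recent jsoup releases, and older releases call the same class Whitelist.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.safety.Cleaner;
import org.jsoup.safety.Safelist;

// Hypothetical helper class, not part of jsoup.
class ScriptStripper {

    // Returns the cleaned body HTML without pretty-printing.
    static String stripScripts(String unsafeHtml) {
        Document unsafeDoc = Jsoup.parse(unsafeHtml);

        // Assumption: a basic safelist; the original cleaner setup is unknown.
        Cleaner cleaner = new Cleaner(Safelist.basic());
        Document safeDoc = cleaner.clean(unsafeDoc);

        // Turn off pretty-printing so jsoup does not add its own
        // line breaks and indentation around tags such as <br>.
        safeDoc.outputSettings().prettyPrint(false);

        return safeDoc.body().html();
    }

    public static void main(String[] args) {
        String html = "<p>Hello<br>world</p><script>alert(1)</script>";
        System.out.println(stripScripts(html));
    }
}
```

Note that the output settings are changed on the cleaned document, since that is the one being serialized. This sketch only covers the whitespace issue; how quotes are escaped depends on the jsoup version and the escape mode in use.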
Source: https://stackoverflow.com/questions/11288324/how-to-prevent-jsoup-cleaner-tampering-the-content