We're using Jsoup.clean(String, Whitelist) to process some input, and it appears that Jsoup is adding an extraneous line break just prior to acceptable tags. I've seen a f
I just downloaded Jsoup 1.7.1; in this version it's possible to use the clean() method with custom OutputSettings:
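// OutputSettings is the nested class org.jsoup.nodes.Document.OutputSettings
String html = "This is a line with <b>bold text</b> within it.";
OutputSettings settings = new OutputSettings();
settings.prettyPrint(false); // disable the line breaks and indentation added on output
String clean = Jsoup.clean(html, "", Whitelist.relaxed(), settings);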
Or shorter:
String clean = Jsoup.clean(html, "", Whitelist.relaxed(), new OutputSettings().prettyPrint(false));
(In fact, it's the same solution as the one posted in the comments.)
Hmm... I have not seen any option for this in clean() itself.
If you parse the HTML into a Document, you have some output settings:
Document doc = Jsoup.parseBodyFragment(htmlToClean);
doc.outputSettings().prettyPrint(false);
System.out.println(doc.body().html());
With prettyPrint off, you'll get the following output: This is a line with <b>bold text</b> within it.
Maybe you can write your own clean() method, since the implemented one uses a Document internally (and there you can disable prettyPrint).
Original methods:
public static String clean(String bodyHtml, Whitelist whitelist) {
    return clean(bodyHtml, "", whitelist);
}

public static String clean(String bodyHtml, String baseUri, Whitelist whitelist) {
    Document dirty = parseBodyFragment(bodyHtml, baseUri);
    Cleaner cleaner = new Cleaner(whitelist);
    Document clean = cleaner.clean(dirty);
    return clean.body().html();
}
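A custom version of that method could look roughly like the sketch below. It repeats the same steps as the original clean() and only switches off prettyPrint on the cleaned Document before serializing it. The class name CompactCleaner and the method name cleanCompact are made up for illustration and are not part of Jsoup. The sketch also assumes a Jsoup version that still ships Whitelist; in newer releases the class is called Safelist.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.safety.Cleaner;
import org.jsoup.safety.Whitelist;

public class CompactCleaner {

    // Hypothetical helper (not a Jsoup API): same steps as Jsoup.clean(),
    // but with pretty-printing disabled on the cleaned document.
    public static String cleanCompact(String bodyHtml, String baseUri, Whitelist whitelist) {
        Document dirty = Jsoup.parseBodyFragment(bodyHtml, baseUri);
        Cleaner cleaner = new Cleaner(whitelist);
        Document clean = cleaner.clean(dirty);
        // The cleaner returns a new Document, so the setting must be applied to it
        clean.outputSettings().prettyPrint(false);
        return clean.body().html();
    }

    public static void main(String[] args) {
        String html = "This is a line with <b>bold text</b> within it.";
        System.out.println(cleanCompact(html, "", Whitelist.relaxed()));
    }
}
```

This is only needed on versions older than 1.7.1; from 1.7.1 on, the clean() overload that takes OutputSettings does the same thing.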