jsoup line feed

闹比i 2020-12-06 16:58

We're using Jsoup.clean(String, Whitelist) to process some input, and it appears that Jsoup is adding an extraneous line break just prior to acceptable tags. I've seen a f

2 Answers
  • 2020-12-06 17:51

    Addendum:

    I just downloaded Jsoup 1.7.1. In this version you can call the clean() method with custom OutputSettings:

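    // Imports needed: org.jsoup.Jsoup, org.jsoup.safety.Whitelist,
    // and org.jsoup.nodes.Document.OutputSettings (a nested class of Document).
    String html = "This is a line with <b>bold text</b> within it.";
    
    OutputSettings settings = new OutputSettings();
    settings.prettyPrint(false); // disable pretty-printing so no line breaks are inserted
    
    String clean = Jsoup.clean(html, "", Whitelist.relaxed(), settings);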
    

    Or shorter:

    String clean = Jsoup.clean(html, "", Whitelist.relaxed(), new OutputSettings().prettyPrint(false));
    

    (In fact, it's the same solution as the one posted in the comments.)

  • 2020-12-06 17:57

    Hmm... I haven't seen any option for this on clean() itself.

    If you parse the HTML into a Document, you get access to its output settings:

    Document doc = Jsoup.parseBodyFragment(htmlToClean);
    doc.outputSettings().prettyPrint(false);
    
    System.out.println(doc.body().html());
    

    With prettyPrint off you'll get the following output: This is a line with <b>bold text</b> within it.

    Maybe you can write your own clean() method, since the built-in one uses Documents internally (and there you can disable prettyPrint):

    Original methods:

    public static String clean(String bodyHtml, Whitelist whitelist) {
        return clean(bodyHtml, "", whitelist);
    }
    
    public static String clean(String bodyHtml, String baseUri, Whitelist whitelist) {
        Document dirty = parseBodyFragment(bodyHtml, baseUri);
        Cleaner cleaner = new Cleaner(whitelist);
        Document clean = cleaner.clean(dirty);
        return clean.body().html();
    }
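
    A minimal sketch of such a custom clean() method, assuming a jsoup version that still ships the Whitelist class (newer releases rename it to Safelist). The class name NoPrettyPrintCleaner is hypothetical and not part of jsoup. It repeats the steps of the original methods and only adds the prettyPrint(false) call on the cleaned document:

    ```java
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.safety.Cleaner;
    import org.jsoup.safety.Whitelist;

    // Hypothetical helper class (not part of jsoup).
    public class NoPrettyPrintCleaner {

        // Same steps as the original Jsoup.clean(), plus one extra line
        // that turns off pretty-printing before serializing the body.
        public static String clean(String bodyHtml, String baseUri, Whitelist whitelist) {
            Document dirty = Jsoup.parseBodyFragment(bodyHtml, baseUri);
            Cleaner cleaner = new Cleaner(whitelist);
            Document clean = cleaner.clean(dirty);
            clean.outputSettings().prettyPrint(false);
            return clean.body().html();
        }

        // Convenience overload with an empty base URI, mirroring the original.
        public static String clean(String bodyHtml, Whitelist whitelist) {
            return clean(bodyHtml, "", whitelist);
        }

        public static void main(String[] args) {
            String html = "This is a line with <b>bold text</b> within it.";
            System.out.println(clean(html, Whitelist.relaxed()));
        }
    }
    ```

    This should only be needed on versions that lack the clean() overload accepting OutputSettings; where that overload exists, calling it directly is simpler.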
    