Remove HTML tags from a String

误落风尘 2020-11-21 07:35

Is there a good way to remove HTML from a Java string? A simple regex like

    replaceAll("\\\\<.*?>", "…

30 Answers
  •  星月不相逢
    2020-11-21 07:44

    This is also very simple with Jericho, and you can retain some of the formatting (line breaks and links, for example).

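        import net.htmlparser.jericho.*;

        // Render the whole document as plain text, keeping line breaks and links
        Source htmlSource = new Source(htmlText);
        Segment htmlSeg = new Segment(htmlSource, 0, htmlSource.length());
        Renderer htmlRend = new Renderer(htmlSeg);
        System.out.println(htmlRend.toString());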
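
For comparison, here is a rough sketch of the plain-regex approach the question mentions, using only the standard library. The class name `RegexTagStripper` and the helper `stripTags` are made up for illustration, and the pattern `<.*?>` is an assumption, since the regex in the question is cut off. It shows where a regex alone may fall short: entities such as `&amp;` are left undecoded, and ordinary text between angle brackets is deleted.

```java
class RegexTagStripper {
    // Hypothetical helper: deletes anything that looks like <...>.
    // It does not parse HTML, so it cannot tell tags from plain text.
    public static String stripTags(String html) {
        return html.replaceAll("<.*?>", "");
    }

    public static void main(String[] args) {
        // Simple markup is handled as expected.
        System.out.println(stripTags("<p>Hello <b>world</b></p>")); // Hello world

        // Entities are not decoded; "&amp;" stays as it is.
        System.out.println(stripTags("<p>Tom &amp; Jerry</p>"));    // Tom &amp; Jerry

        // Non-HTML text between '<' and '>' is removed as well.
        System.out.println(stripTags("if a < b and b > c"));        // if a  c
    }
}
```

A parser-based renderer such as the one above avoids both problems, at the cost of an extra dependency.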
    
