Remove HTML tags from a String

误落风尘 2020-11-21 07:35

Is there a good way to remove HTML from a Java string? A simple regex like

replaceAll("\\<.*?>", "…


        
30 Answers
  •  灰色年华
    2020-11-21 08:03

    It sounds like you want to convert HTML to plain text.
    If so, take a look at www.htmlparser.org. The example below strips all the tags from the HTML page found at a URL.
    It uses org.htmlparser.beans.StringBean.

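    import org.htmlparser.beans.StringBean;

    // Returns the text of the page at the given URL with all tags stripped.
    public static String getUrlContentsAsText(String url) {
        StringBean stringBean = new StringBean();
        stringBean.setURL(url);          // point the bean at the page to extract
        return stringBean.getStrings();  // the extracted plain text
    }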
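
For comparison with the regex mentioned in the question, here is a minimal, dependency-free sketch for a String that is already in memory. The class name StripTags and the method stripTags are hypothetical names chosen for illustration; they are not part of htmlparser or of the original question. It only removes tag-like spans and does not decode entities, which is one reason a real parser such as the one above is usually the safer choice.

```java
import java.util.regex.Pattern;

public class StripTags {
    // Matches anything that looks like a tag: '<', any run of non-'>' characters, then '>'.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    // Removes tag-like spans. Entities such as &amp; are left untouched (assumption:
    // the caller decodes them separately or does not need them decoded).
    public static String stripTags(String html) {
        return TAG.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String html = "<p>Fish &amp; <b>chips</b></p>";
        System.out.println(stripTags(html)); // prints: Fish &amp; chips
    }
}
```

Note that a pattern like this can misfire on input where a '>' appears inside an attribute value, and it keeps the contents of script and style elements, so treat it as a quick filter, not a general HTML-to-text converter.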
    
