Is it possible to convert HTML into XHTML with Jsoup 1.8.1?

臣服心动 2020-12-15 05:19
    String body = "";
    Document document = Jsoup.parseBodyFragment(body);
    document.outputSettings().escapeMode(EscapeMode.xhtml);
    String str = document.body()
3 Answers
  • 2020-12-15 05:36

    You need to tell Jsoup which output syntax to use when serializing the document: HTML or XML.

    public String parserXHtml(String html) {
        org.jsoup.nodes.Document document = Jsoup.parseBodyFragment(html);
        // Emit XML syntax so that void elements are self-closed (e.g. <br />)
        document.outputSettings().syntax(org.jsoup.nodes.Document.OutputSettings.Syntax.xml);
        document.outputSettings().charset("UTF-8");
        return document.toString();
    }
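
    To check whether the string you get back is actually well-formed XML, you could run it through the JDK's own XML parser. The helper below is only a sketch: the class and method names (XmlWellFormedCheck, isWellFormedXml) are made up for illustration and are not part of Jsoup. It also assumes the markup has a single root element and no DOCTYPE, so HTML-only entities such as &nbsp; will be rejected because nothing declares them.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Jsoup: checks XML well-formedness only.
final class XmlWellFormedCheck {

    /** Returns true if the markup parses as well-formed XML, false otherwise. */
    static boolean isWellFormedXml(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler ignores warnings and rethrows fatal errors,
            // which keeps the parser from printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (Exception e) {
            // Any parse failure (unclosed tag, undeclared entity, ...) lands here.
            return false;
        }
    }
}
```

    For instance, isWellFormedXml("<p>one<br />two</p>") should pass, while the same string with a bare <br> should fail. This only tests well-formedness; it does not validate against the XHTML DTD.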
    
  • 2020-12-15 05:44

    See Document.OutputSettings.Syntax.xml:

    private String toXHTML( String html ) {
        final Document document = Jsoup.parse(html);
        document.outputSettings().syntax(Document.OutputSettings.Syntax.xml);    
        return document.html();
    }
    
  • 2020-12-15 05:56

    You can also use the JTidy API for this, for example jtidy-r938.jar.

    The following method reads an HTML file and writes it back out as XHTML:

    // Needs: java.io.FileInputStream, java.io.FileOutputStream,
    //        java.io.InputStream, java.io.OutputStream,
    //        java.io.IOException, org.w3c.tidy.Tidy
    public static String getXHTMLFromHTML(String inputFile,
            String outputFile) throws IOException {
        // try-with-resources closes both streams, even if parsing fails
        try (InputStream is = new FileInputStream(inputFile);
             OutputStream fos = new FileOutputStream(outputFile)) {
            Tidy tidy = new Tidy();
            tidy.setXHTML(true); // write XHTML instead of HTML
            tidy.parse(is, fos);
        }
        // A missing input file now surfaces as an exception instead of
        // being logged and silently ignored
        return outputFile;
    }
    