How to pretty print XML from Java?

慢半拍i 2020-11-22 01:55

I have a Java String that contains XML, with no line feeds or indentations. I would like to turn it into a String with nicely formatted XML. How do I do this?



        
30 answers
  • 2020-11-22 02:32

    Just for future reference, here's a solution that worked for me (thanks to a comment that @George Hawkins posted in one of the answers):

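    // "document" is an org.w3c.dom.Document that has already been parsed.
    DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
    DOMImplementationLS impl = (DOMImplementationLS) registry.getDOMImplementation("LS");
    LSSerializer writer = impl.createLSSerializer();
    // Ask the serializer to add line breaks and indentation.
    writer.getDomConfig().setParameter("format-pretty-print", Boolean.TRUE);
    LSOutput output = impl.createLSOutput();
    // Fix the encoding so the bytes can be decoded reliably below.
    output.setEncoding("UTF-8");
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    output.setByteStream(out);
    writer.write(document, output);
    String xmlStr = new String(out.toByteArray(), StandardCharsets.UTF_8);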
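
    The snippet above assumes a `document` that has already been parsed. As a minimal sketch of the whole String-to-String round trip, assuming the JDK's built-in DOM Load and Save implementation, something like the following may work. The class name `LsPrettyPrinter` and the method `format` are made up for illustration, and the exact whitespace of the result can differ between JDK versions.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSOutput;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;

// Hypothetical helper class; the name is illustrative only.
class LsPrettyPrinter {

    static String format(String xml) throws Exception {
        // Parse the unformatted string into a DOM Document.
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document document = builder.parse(new InputSource(new StringReader(xml)));

        // Obtain a Load and Save serializer and switch on pretty printing.
        DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
        DOMImplementationLS impl = (DOMImplementationLS) registry.getDOMImplementation("LS");
        LSSerializer serializer = impl.createLSSerializer();
        serializer.getDomConfig().setParameter("format-pretty-print", Boolean.TRUE);

        // Write to a character stream so no byte-to-String decoding is needed.
        LSOutput output = impl.createLSOutput();
        output.setEncoding("UTF-8"); // encoding named in the XML declaration
        StringWriter writer = new StringWriter();
        output.setCharacterStream(writer);
        serializer.write(document, output);
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(format("<root><child>text</child><empty/></root>"));
    }
}
```

    Writing to a `StringWriter` instead of a `ByteArrayOutputStream` sidesteps the question of which charset to use when turning the bytes back into a String.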
    
  • 2020-11-22 02:35

    If you are sure your XML is valid, this approach is simple and avoids building a DOM tree. It may have bugs, so please comment if you spot any.

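    public String prettyPrint(String xml) {
        if (xml == null || xml.trim().isEmpty()) return "";

        int depth = 0;
        StringBuilder pretty = new StringBuilder();
        // Put every tag and every run of text on its own line.
        String[] rows = xml.trim().replace(">", ">\n").replace("<", "\n<").split("\n");

        for (String raw : rows) {
            String row = raw.trim();
            if (row.isEmpty()) continue;

            if (row.startsWith("<?")) {
                // XML declaration or processing instruction: not indented
                pretty.append(row).append("\n");
            } else if (row.startsWith("</")) {
                // closing tag: step back one level (never below zero)
                depth = Math.max(0, depth - 1);
                pretty.append(repeatString("    ", depth)).append(row).append("\n");
            } else if (row.startsWith("<!") || (row.startsWith("<") && row.endsWith("/>"))) {
                // comment, DOCTYPE or self-closing tag: depth stays the same
                pretty.append(repeatString("    ", depth)).append(row).append("\n");
            } else if (row.startsWith("<")) {
                // opening tag: indent, then go one level deeper
                pretty.append(repeatString("    ", depth++)).append(row).append("\n");
            } else {
                // text content
                pretty.append(repeatString("    ", depth)).append(row).append("\n");
            }
        }

        return pretty.toString().trim();
    }

    // Helper the snippet relies on: returns s repeated count times.
    private static String repeatString(String s, int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) sb.append(s);
        return sb.toString();
    }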
    
  • 2020-11-22 02:39

    I combined the approaches above into one small program. It reads an XML file and prints it out. Just replace xyz with your own file path.

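    import java.io.File;
    import java.io.StringWriter;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.transform.OutputKeys;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerException;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.dom.DOMSource;
    import javax.xml.transform.stream.StreamResult;
    import org.w3c.dom.Document;

    // The class name is illustrative; the original post showed only the two methods.
    public class XmlFilePrettyPrinter {

        public static void main(String[] args) throws Exception {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setValidating(false);
            DocumentBuilder db = dbf.newDocumentBuilder();
            // Parsing from a File lets the parser open and close the stream itself.
            Document doc = db.parse(new File("C:/Users/xyz.xml"));
            System.out.println(prettyPrint(doc));
        }

        private static String prettyPrint(Document document) throws TransformerException {
            TransformerFactory transformerFactory = TransformerFactory.newInstance();
            Transformer transformer = transformerFactory.newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            // Implementation-specific property understood by Xalan-based transformers.
            transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
            DOMSource source = new DOMSource(document);
            StringWriter strWriter = new StringWriter();
            StreamResult result = new StreamResult(strWriter);
            transformer.transform(source, result);
            return strWriter.toString();
        }
    }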
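
    Since the question starts from a String rather than a file, here is a hedged sketch of the same Transformer idea that reads from and writes to Strings, using `StreamSource` so that no DOM is built explicitly. The class name `TransformerPrettyPrinter` and the `indent` parameter are assumptions for illustration. The `indent-amount` property is specific to Xalan-based transformers such as the one bundled with the JDK, so another `TransformerFactory` on the classpath may reject it.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class; the name is illustrative only.
class TransformerPrettyPrinter {

    static String format(String xml, int indent) throws Exception {
        // An identity transformer copies the input to the output unchanged,
        // apart from the serialization options set below.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        // Implementation-specific: understood by Xalan-based transformers.
        transformer.setOutputProperty(
                "{http://xml.apache.org/xslt}indent-amount", String.valueOf(indent));

        StringWriter writer = new StringWriter();
        transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(format("<root><child>text</child><empty/></root>", 2));
    }
}
```

    The input in this thread has no line feeds, which suits this approach: whitespace-only text already present between elements is usually preserved by the transformer and can spoil the indentation.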
    
  • 2020-11-22 02:42

    I had the same problem and I'm having great success with JTidy (http://jtidy.sourceforge.net/index.html)

    Example:

    Tidy t = new Tidy();
    t.setIndentContent(true);
    // parseDOM takes an InputStream plus an optional OutputStream (null here).
    Document d = t.parseDOM(
        new ByteArrayInputStream("HTML goes here".getBytes()), null);

    ByteArrayOutputStream out = new ByteArrayOutputStream();
    t.pprint(d, out);
    String html = out.toString();
    
  • 2020-11-22 02:42

    This is for those searching for a quick and dirty solution that doesn't need the XML to be 100% valid, e.g. for REST / SOAP logging (you never know what the others send ;-)).

    I found a code snippet online and extended it. I think it is still missing here as a valid possible approach:

    public static String prettyPrintXMLAsString(String xmlString) {
        /* Remove new lines */
        final String LINE_BREAK = "\n";
        xmlString = xmlString.replaceAll(LINE_BREAK, "");
        StringBuffer prettyPrintXml = new StringBuffer();
        /* Group the xml tags */
        Pattern pattern = Pattern.compile("(<[^/][^>]+>)?([^<]*)(</[^>]+>)?(<[^/][^>]+/>)?");
        Matcher matcher = pattern.matcher(xmlString);
        int tabCount = 0;
        while (matcher.find()) {
            // An optional group that did not take part in the match is null.
            String str1 = matcher.group(1) == null ? "" : matcher.group(1);
            String str2 = matcher.group(2) == null ? "" : matcher.group(2);
            String str3 = matcher.group(3) == null ? "" : matcher.group(3);
            String str4 = matcher.group(4) == null ? "" : matcher.group(4);
    
            if (matcher.group() != null && !matcher.group().trim().equals("")) {
                printTabs(tabCount, prettyPrintXml);
                if (!str1.equals("") && str3.equals("")) {
                    ++tabCount;
                }
                if (str1.equals("") && !str3.equals("")) {
                    --tabCount;
                    prettyPrintXml.deleteCharAt(prettyPrintXml.length() - 1);
                }
    
                prettyPrintXml.append(str1);
                prettyPrintXml.append(str2);
                prettyPrintXml.append(str3);
                if (!str4.equals("")) {
                    prettyPrintXml.append(LINE_BREAK);
                    printTabs(tabCount, prettyPrintXml);
                    prettyPrintXml.append(str4);
                }
                prettyPrintXml.append(LINE_BREAK);
            }
        }
        return prettyPrintXml.toString();
    }
    
    private static void printTabs(int count, StringBuffer stringBuffer) {
        for (int i = 0; i < count; i++) {
            stringBuffer.append("\t");
        }
    }
    
    public static void main(String[] args) {
        String x = new String(
                "<soap:Envelope xmlns:soap=\"http://schemas.xmlsoap.org/soap/envelope/\"><soap:Body><soap:Fault><faultcode>soap:Client</faultcode><faultstring>INVALID_MESSAGE</faultstring><detail><ns3:XcbSoapFault xmlns=\"\" xmlns:ns3=\"http://www.someapp.eu/xcb/types/xcb/v1\"><CauseCode>20007</CauseCode><CauseText>INVALID_MESSAGE</CauseText><DebugInfo>Problems creating SAAJ object model</DebugInfo></ns3:XcbSoapFault></detail></soap:Fault></soap:Body></soap:Envelope>");
        System.out.println(prettyPrintXMLAsString(x));
    }
    

    Here is the output (each tab shown as four spaces):

    <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
        <soap:Body>
            <soap:Fault>
                <faultcode>soap:Client</faultcode>
                <faultstring>INVALID_MESSAGE</faultstring>
                <detail>
                    <ns3:XcbSoapFault xmlns="" xmlns:ns3="http://www.someapp.eu/xcb/types/xcb/v1">
                        <CauseCode>20007</CauseCode>
                        <CauseText>INVALID_MESSAGE</CauseText>
                        <DebugInfo>Problems creating SAAJ object model</DebugInfo>
                    </ns3:XcbSoapFault>
                </detail>
            </soap:Fault>
        </soap:Body>
    </soap:Envelope>
    
  • 2020-11-22 02:43

    Since you are starting with a String, you can convert it to a DOM object (e.g. a Node) before using the Transformer. However, if you know your XML string is valid and you don't want the memory overhead of parsing it into a DOM and then running a transform over the DOM to get a string back, you could do some old-fashioned character-by-character parsing. Insert a newline and indentation spaces after every tag, and keep an indent counter (to determine the number of spaces) that you increment for every <...> and decrement for every </...> you see.

    Disclaimer - I did a cut/paste/text edit of the functions below, so they may not compile as is.

    public static final Element createDOM(String strXML) 
        throws ParserConfigurationException, SAXException, IOException {
    
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        // Validation needs a DTD or schema; plain pretty-printing does not.
        dbf.setValidating(false);
        DocumentBuilder db = dbf.newDocumentBuilder();
        InputSource sourceXML = new InputSource(new StringReader(strXML));
        Document xmlDoc = db.parse(sourceXML);
        Element e = xmlDoc.getDocumentElement();
        e.normalize();
        return e;
    }
    
    public static final void prettyPrint(Node xml, OutputStream out)
        throws TransformerConfigurationException, TransformerFactoryConfigurationError, TransformerException {
        Transformer tf = TransformerFactory.newInstance().newTransformer();
        tf.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        tf.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
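        tf.setOutputProperty(OutputKeys.INDENT, "yes");
        // Xalan-based transformers may indent by 0 unless an amount is set.
        tf.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");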
        tf.transform(new DOMSource(xml), new StreamResult(out));
    }
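
    The character-by-character idea described at the top of this answer was not shown in code. A minimal sketch of it follows, under the stated assumption that the input is well-formed. The class name `CharScanIndenter` and the `width` parameter are made up for illustration. The sketch puts text content on its own line, which changes the whitespace inside text nodes, and it does not handle comments or CDATA sections that contain `>`.

```java
// Hypothetical helper class; the name is illustrative only.
class CharScanIndenter {

    static String indent(String xml, int width) {
        StringBuilder out = new StringBuilder();
        int depth = 0; // the indent counter
        int i = 0;
        int n = xml.length();
        while (i < n) {
            if (xml.charAt(i) == '<') {
                // Read one whole tag, from '<' up to the next '>'.
                int end = xml.indexOf('>', i);
                if (end < 0) end = n - 1; // unterminated tag: take the rest
                String tag = xml.substring(i, end + 1);
                boolean closing = tag.startsWith("</");
                boolean standalone = tag.startsWith("<?") || tag.startsWith("<!")
                        || tag.endsWith("/>");
                // Decrement before printing a closing tag, never below zero.
                if (closing) depth = Math.max(0, depth - 1);
                appendLine(out, depth, width, tag);
                // Increment after printing an opening tag.
                if (!closing && !standalone) depth++;
                i = end + 1;
            } else {
                // Read text up to the next tag and print it if it is not blank.
                int end = xml.indexOf('<', i);
                if (end < 0) end = n;
                String text = xml.substring(i, end).trim();
                if (!text.isEmpty()) appendLine(out, depth, width, text);
                i = end;
            }
        }
        return out.toString().trim();
    }

    private static void appendLine(StringBuilder out, int depth, int width, String s) {
        for (int k = 0; k < depth * width; k++) out.append(' ');
        out.append(s).append('\n');
    }

    public static void main(String[] args) {
        System.out.println(indent("<a><b>x</b><c/></a>", 2));
    }
}
```

    This avoids building a DOM at all, at the cost of the limitations listed above. For anything beyond logging or debugging output, a real parser is the safer choice.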
    