How to pretty print XML from Java?

慢半拍i 2020-11-22 01:55

I have a Java String that contains XML, with no line feeds or indentations. I would like to turn it into a String with nicely formatted XML. How do I do this?



        
30 Answers
  • 2020-11-22 02:48

    Here's a way of doing it using dom4j:

    Imports:

    import java.io.StringWriter;
    import org.dom4j.Document;
    import org.dom4j.DocumentHelper;
    import org.dom4j.io.OutputFormat;
    import org.dom4j.io.XMLWriter;
    

    Code:

    String xml = "<your xml='here'/>";
    // Parse the string into a dom4j document
    Document doc = DocumentHelper.parseText(xml);
    StringWriter sw = new StringWriter();
    // createPrettyPrint() enables indentation and newlines
    OutputFormat format = OutputFormat.createPrettyPrint();
    XMLWriter xw = new XMLWriter(sw, format);
    xw.write(doc);
    String result = sw.toString();
    
  • 2020-11-22 02:48

    In Java 1.6.0_32, I found that the usual way to pretty print an XML string (a Transformer with a null or identity XSLT) does not behave as I would like when tags are separated by whitespace rather than by nothing at all. Adding <xsl:strip-space elements="*"/> to my template did not help. The simplest fix I found was to strip the whitespace myself with a SAXSource and an XML filter. Because my use case was logging, I also extended the code to cope with incomplete XML fragments. Note that the usual method seems to work fine with a DOMSource, but I did not want to use one because the fragments may be incomplete and because of the memory overhead.

    public static class WhitespaceIgnoreFilter extends XMLFilterImpl
    {
    
        @Override
        public void ignorableWhitespace(char[] arg0,
                                        int arg1,
                                        int arg2) throws SAXException
        {
            //Ignore it then...
        }
    
        @Override
        public void characters( char[] ch,
                                int start,
                                int length) throws SAXException
        {
            // Forward only text that is not pure whitespace.
            if (!new String(ch, start, length).trim().isEmpty())
                super.characters(ch, start, length);
        }
    }
    
    public static String prettyXML(String logMsg, boolean allowBadlyFormedFragments) throws SAXException, IOException, TransformerException
        {
            TransformerFactory transFactory = TransformerFactory.newInstance();
            transFactory.setAttribute("indent-number", Integer.valueOf(2));
            Transformer transformer = transFactory.newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
            StringWriter out = new StringWriter();
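            // SAXHelper is the author's own utility and is not shown here; it is assumed to return a configured XMLReader.
            XMLReader masterParser = SAXHelper.getSAXParser(true);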
            XMLFilter parser = new WhitespaceIgnoreFilter();
            parser.setParent(masterParser);
    
            if(allowBadlyFormedFragments)
            {
                transformer.setErrorListener(new ErrorListener()
                {
                    @Override
                    public void warning(TransformerException exception) throws TransformerException
                    {
                    }
    
                    @Override
                    public void fatalError(TransformerException exception) throws TransformerException
                    {
                    }
    
                    @Override
                    public void error(TransformerException exception) throws TransformerException
                    {
                    }
                });
            }
    
            try
            {
                transformer.transform(new SAXSource(parser, new InputSource(new StringReader(logMsg))), new StreamResult(out));
            }
            catch (TransformerException e)
            {
                if(e.getCause() instanceof SAXParseException)
                {
                    if(!allowBadlyFormedFragments || !"XML document structures must start and end within the same entity.".equals(e.getCause().getMessage()))
                    {
                        throw e;
                    }
                }
                else
                {
                    throw e;
                }
            }
            out.flush();
            return out.toString();
        }
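
For comparison, here is a minimal sketch of the "usual method" this answer refers to, in the DOMSource variant that is said to work. It is not the answer author's code: the class name DomPrettyPrintSketch and the indent parameter are hypothetical, and the XPath step that removes whitespace-only text nodes is an added assumption about how to keep existing line breaks from disturbing the indentation. The indent-amount property is implementation-specific and is assumed to be honoured by the JDK's built-in transformer.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class name; a sketch, not part of the answer above.
class DomPrettyPrintSketch {

    static String format(String xml, int indent) throws Exception {
        // Parse the whole string into a DOM tree (requires well-formed input).
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Remove whitespace-only text nodes so that line breaks already present
        // between tags do not interfere with the indentation added below.
        NodeList blanks = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//text()[normalize-space(.) = '']", doc, XPathConstants.NODESET);
        for (int i = 0; i < blanks.getLength(); i++) {
            Node blank = blanks.item(i);
            blank.getParentNode().removeChild(blank);
        }

        // Identity transform: no stylesheet, only output properties.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        // Implementation-specific property; other transformers may ignore it.
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount",
                String.valueOf(indent));

        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(format("<tag>\n\n   <nested>hello</nested>\n</tag>", 2));
    }
}
```

Unlike the SAX filter above, this builds the full tree in memory and fails on incomplete fragments, which is the trade-off the answer describes.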
    
  • 2020-11-22 02:50

    Now that it's 2012 and Java can do more with XML than it used to, I'd like to add an alternative to my accepted answer. It has no dependencies outside of Java 6.

    import org.w3c.dom.Node;
    import org.w3c.dom.bootstrap.DOMImplementationRegistry;
    import org.w3c.dom.ls.DOMImplementationLS;
    import org.w3c.dom.ls.LSSerializer;
    import org.xml.sax.InputSource;
    
    import javax.xml.parsers.DocumentBuilderFactory;
    import java.io.StringReader;
    
    /**
     * Pretty-prints xml, supplied as a string.
     * <p/>
     * eg.
     * <code>
     * String formattedXml = new XmlFormatter().format("<tag><nested>hello</nested></tag>");
     * </code>
     */
    public class XmlFormatter {
    
        public String format(String xml) {
    
            try {
                final InputSource src = new InputSource(new StringReader(xml));
                final Node document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(src).getDocumentElement();
                final Boolean keepDeclaration = Boolean.valueOf(xml.startsWith("<?xml"));
    
                // May need this: System.setProperty(DOMImplementationRegistry.PROPERTY,"com.sun.org.apache.xerces.internal.dom.DOMImplementationSourceImpl");
    
    
                final DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
                final DOMImplementationLS impl = (DOMImplementationLS) registry.getDOMImplementation("LS");
                final LSSerializer writer = impl.createLSSerializer();
    
                writer.getDomConfig().setParameter("format-pretty-print", Boolean.TRUE); // Indent and line-break the output.
                writer.getDomConfig().setParameter("xml-declaration", keepDeclaration); // Emit the XML declaration only if the input had one.
    
                return writer.writeToString(document);
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }
    
        public static void main(String[] args) {
            String unformattedXml =
                    "<?xml version=\"1.0\" encoding=\"UTF-8\"?><QueryMessage\n" +
                            "        xmlns=\"http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message\"\n" +
                            "        xmlns:query=\"http://www.SDMX.org/resources/SDMXML/schemas/v2_0/query\">\n" +
                            "    <Query>\n" +
                            "        <query:CategorySchemeWhere>\n" +
                            "   \t\t\t\t\t         <query:AgencyID>ECB\n\n\n\n</query:AgencyID>\n" +
                            "        </query:CategorySchemeWhere>\n" +
                            "    </Query>\n\n\n\n\n" +
                            "</QueryMessage>";
    
            System.out.println(new XmlFormatter().format(unformattedXml));
        }
    }
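
One thing to watch with writeToString: as far as I know, the string it returns is treated as UTF-16, so a retained XML declaration may read encoding="UTF-16" even when the input said UTF-8. The sketch below is a variation on the class above, not the answer author's code. It assumes that writing through an LSOutput with an explicit encoding is an acceptable substitute; the class name LsOutputSketch is hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSOutput;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;

// Hypothetical class name; a sketch, not part of the answer above.
class LsOutputSketch {

    static String format(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        DOMImplementationLS impl = (DOMImplementationLS)
                DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
        LSSerializer serializer = impl.createLSSerializer();
        serializer.getDomConfig().setParameter("format-pretty-print", Boolean.TRUE);

        // Write through an LSOutput so the encoding named in the declaration
        // can be chosen explicitly instead of being left to writeToString.
        StringWriter out = new StringWriter();
        LSOutput output = impl.createLSOutput();
        output.setEncoding("UTF-8");
        output.setCharacterStream(out);
        serializer.write(doc, output);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(format("<tag><nested>hello</nested></tag>"));
    }
}
```

The declared encoding only labels the text; if the result is later written to a file or socket, the bytes still have to be encoded as UTF-8 by the caller.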
    
  • 2020-11-22 02:50

    Here is another solution that works for us:

    import java.io.StringWriter;
    import org.dom4j.DocumentHelper;
    import org.dom4j.io.OutputFormat;
    import org.dom4j.io.XMLWriter;
    
    /**
     * Pretty-prints an XML string.
     *
     * @param xml the XML to format
     * @return the formatted XML
     */
    public static String prettyPrintXml(String xml) {
    
        final StringWriter sw;
    
        try {
            final OutputFormat format = OutputFormat.createPrettyPrint();
            final org.dom4j.Document document = DocumentHelper.parseText(xml);
            sw = new StringWriter();
            final XMLWriter writer = new XMLWriter(sw, format);
            writer.write(document);
        }
        catch (Exception e) {
            throw new RuntimeException("Error pretty printing xml:\n" + xml, e);
        }
        return sw.toString();
    }
    
  • 2020-11-22 02:50

    Using JDOM2 (http://www.jdom.org/):

    import java.io.StringReader;
    import org.jdom2.input.SAXBuilder;
    import org.jdom2.output.Format;
    import org.jdom2.output.XMLOutputter;
    
    String prettyXml = new XMLOutputter(Format.getPrettyFormat()).
                             outputString(new SAXBuilder().build(new StringReader(uglyXml)));
    
  • 2020-11-22 02:51

    Using Scala:

    import xml._
    val xml = XML.loadString("<tag><nested>hello</nested></tag>")
    val formatted = new PrettyPrinter(150, 2).format(xml)
    println(formatted)
    

    You can do this in Java too, if you depend on scala-library.jar (from Scala 2.11 onward, scala.xml ships in the separate scala-xml module). It looks like this:

    import scala.xml.*;
    
    public class FormatXML {
        public static void main(String[] args) {
            String unformattedXml = "<tag><nested>hello</nested></tag>";
            PrettyPrinter pp = new PrettyPrinter(150, 3);
            String formatted = pp.format(XML.loadString(unformattedXml), TopScope$.MODULE$);
            System.out.println(formatted);
        }
    }
    

    The PrettyPrinter object is constructed with two ints, the first being max line length and the second being the indentation step.
