Forcing MSXML to format XML output with indents and newlines

Submitted by 南楼画角 on 2019-11-30 15:41:37

For files as small as a config file, the overhead of using XSL probably isn't significant anyway. The power of SAX matters more when you are dealing with large files, or with very many small ones as on the server side of a Web Service, and there you probably should not be using the heavyweight DOM in the first place.

Private Sub FormatDocToFile(ByVal Doc As MSXML2.DOMDocument, _
                            ByVal FileName As String)
    'Reformats the DOMDocument "Doc" into an ADODB.Stream
    'and writes it to the specified file.
    '
    'Note the UTF-8 output never gets a BOM.  If we want one we
    'have to write it here explicitly after opening the Stream.
    Dim rdrDom As MSXML2.SAXXMLReader
    Dim stmFormatted As ADODB.Stream
    Dim wtrFormatted As MSXML2.MXXMLWriter

    Set stmFormatted = New ADODB.Stream
    With stmFormatted
        .Open
        .Type = adTypeBinary
        Set wtrFormatted = New MSXML2.MXXMLWriter
        With wtrFormatted
            .omitXMLDeclaration = False
            .standalone = True
            .byteOrderMark = False 'If not set (even to False) then
                                   '.encoding is ignored.
            .encoding = "utf-8"    'Even if .byteOrderMark = True
                                   'UTF-8 never gets a BOM.
            .indent = True
            .output = stmFormatted
            Set rdrDom = New MSXML2.SAXXMLReader
            With rdrDom
                Set .contentHandler = wtrFormatted
                Set .dtdHandler = wtrFormatted
                Set .errorHandler = wtrFormatted
                .putProperty "http://xml.org/sax/properties/lexical-handler", _
                             wtrFormatted
                .putProperty "http://xml.org/sax/properties/declaration-handler", _
                             wtrFormatted
                .parse Doc
            End With
        End With
        .SaveToFile FileName
        .Close
    End With
End Sub

This answer probably will not help in your specific case, but it may be useful in general. It applies when a document is loaded and saved again without much modification. DOMDocument has a preserveWhiteSpace property, which is initially False. If you set it to True before calling load, the document is saved with the same indentation as the original file.
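As a minimal sketch of that round trip, assuming late binding and the version-independent "MSXML2.DOMDocument" ProgID (the procedure name and its parameters are hypothetical, chosen only for illustration):

```vb
' Sketch: reload and resave a document while keeping its existing
' whitespace. ResaveKeepingWhitespace, SourcePath and TargetPath are
' hypothetical names, not part of MSXML.
Public Sub ResaveKeepingWhitespace(ByVal SourcePath As String, _
                                   ByVal TargetPath As String)
    Dim oDoc As Object ' MSXML2.DOMDocument

    Set oDoc = CreateObject("MSXML2.DOMDocument")
    oDoc.async = False                 'load synchronously
    oDoc.preserveWhiteSpace = True     'must be set before load
    If Not oDoc.Load(SourcePath) Then
        Err.Raise vbObjectError + 1, , oDoc.parseError.reason
    End If
    '... make small changes to oDoc here ...
    oDoc.Save TargetPath
End Sub
```

Nodes added after loading are not indented automatically; only the whitespace that was already in the file is kept.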

To add the indentation manually one may create text nodes and insert them to create new lines and spaces between elements, like this:

Set txt = doc.createTextNode(vbCrLf & "  ")
Call node.parentNode.insertBefore(txt, node)

You could take a look at this other question on SO and the C++ code in its answers, but that is too much work. You say you are only storing a config file, so use an XSLT transformation:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:strip-space elements="*"/>
  <xsl:output indent="yes"/>
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>

Remember to output to an ADODB.Stream, not to a DOM. If you output to a DOM, the XSLT serializer is bypassed, so the xsl:output settings (including indent) are ignored.
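One way to apply the stylesheet to a stream might look like the sketch below. It assumes the stylesheet above has been saved to a file, that Doc and the stylesheet are DOM documents from the same MSXML version, and that late binding is acceptable. TransformToFile and its parameters are hypothetical names.

```vb
' Sketch: run an XSLT stylesheet over Doc and write the serialized
' result to a file through an ADODB.Stream.
' TransformToFile, StylesheetPath and FileName are hypothetical names.
Public Sub TransformToFile(ByVal Doc As Object, _
                           ByVal StylesheetPath As String, _
                           ByVal FileName As String)
    'ADODB constants, declared here because late binding is used.
    Const adTypeBinary As Long = 1
    Const adSaveCreateOverWrite As Long = 2
    Dim oXsl As Object    ' MSXML2.DOMDocument
    Dim oStream As Object ' ADODB.Stream

    Set oXsl = CreateObject("MSXML2.DOMDocument")
    oXsl.async = False
    If Not oXsl.Load(StylesheetPath) Then
        Err.Raise vbObjectError + 1, , oXsl.parseError.reason
    End If

    Set oStream = CreateObject("ADODB.Stream")
    oStream.Type = adTypeBinary
    oStream.Open
    'Passing a stream (not a DOM) lets xsl:output take effect.
    Doc.transformNodeToObject oXsl, oStream
    oStream.SaveToFile FileName, adSaveCreateOverWrite
    oStream.Close
End Sub
```

The encoding of the written file depends on the xsl:output settings, so it is worth checking the result if a specific encoding such as UTF-8 is required.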

Here is a shorter indentation utility function that accepts either a DOM object or a string as input and outputs a formatted string. File handling (UTF-8) is left outside its scope. It does not use ADODB streams and does not need MSXML in the project references.

Public Function FormatXmlIndent(vDomOrString As Variant, sResult As String) As Boolean
    Dim oWriter         As Object ' MSXML2.MXXMLWriter

    On Error GoTo QH
    Set oWriter = CreateObject("MSXML2.MXXMLWriter")
    oWriter.omitXMLDeclaration = True
    oWriter.indent = True
    With CreateObject("MSXML2.SAXXMLReader")
        Set .contentHandler = oWriter
        '--- keep CDATA sections (reported through the lexical handler)
        .putProperty "http://xml.org/sax/properties/lexical-handler", oWriter
        .parse vDomOrString
    End With
    sResult = oWriter.output
    '--- success
    FormatXmlIndent = True
    Exit Function
QH:
End Function

It can be used like this:

    sXml = ReadTextFile("doc.xml")
    FormatXmlIndent sXml, sXml

... so if anything fails (invalid XML, etc.), sXml still holds the original unformatted input.
