How to detect and remove empty XML tags?

Posted by 天大地大妈咪最大 on 2021-01-08 06:44:53

Question


I've got a bunch of XML files, and I want to detect and remove the empty tags inside them, such as:

<My></My>
<Your/>

<sometags>
    <his>
    </his>
    <hasContent>sdfaf</hasContent>
</sometags>

These are all the kinds of empty tags (My, Your, his) that I want to remove. Does PowerShell support this kind of empty-tag detection, no matter how deeply the tags are nested inside other tags?


Answer 1:


# Pretty-prints an XML document as an indented string.
function Format-XML
{
    param (
        [Parameter(Mandatory = $true)][xml] $xml,
        [Parameter(Mandatory = $false)][int] $indent = 4
    )

    try
    {
        # Write the document through an indenting XmlTextWriter into a string
        $StringWriter = New-Object System.IO.StringWriter
        $XmlWriter = New-Object System.Xml.XmlTextWriter $StringWriter
        $XmlWriter.Formatting = "Indented"
        $XmlWriter.Indentation = $indent
        $xml.WriteContentTo($XmlWriter)
        $XmlWriter.Flush()
        $StringWriter.Flush()

        return $StringWriter.ToString()
    }
    catch
    {
        # Report the failure and return nothing instead of a partial string
        Write-Host "$($MyInvocation.InvocationName): $_"
        return $null
    }
}


$xml = [xml] @"
<document>
    <My></My>
    <Your/>
    <sometags>
        <his>
        </his>
        <hasContent>sdfaf</hasContent>
    </sometags>
</document>
"@

# The "magic" part is in this XPath expression

$nodes = $xml.SelectNodes("//*[count(@*) = 0 and count(child::*) = 0 and not(string-length(text()) > 0)]")

$nodes | ForEach-Object {
    # RemoveChild returns the removed node; discard it so it is not printed
    [void] $_.ParentNode.RemoveChild($_)
}

Format-Xml $xml



Answer 2:


I'm not fluent in PowerShell, so this is only a small addition to @DavidBrabant's good answer, specifically to the XPath part. The XPath for detecting empty elements can be a bit simpler:

//*[not(@*) and not(*) and not(normalize-space())]

The predicates (everything within []) check, in order, that the current element has no attributes, no child elements, and no non-whitespace text.
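
To make the predicate concrete, here is a minimal sketch that applies it to the sample document from the question. It assumes the default [xml] cast, which does not preserve whitespace-only text nodes; the variable names are illustrative only.

```powershell
# Sample document from the question, wrapped in a root element.
$xml = [xml] @"
<document>
    <My></My>
    <Your/>
    <sometags>
        <his>
        </his>
        <hasContent>sdfaf</hasContent>
    </sometags>
</document>
"@

# Elements with no attributes, no child elements and no non-whitespace text.
$xpath = '//*[not(@*) and not(*) and not(normalize-space())]'

# @(...) copies the matches into an array, so removing nodes
# does not disturb the enumeration of the node list.
$empty = @($xml.SelectNodes($xpath))

foreach ($node in $empty) {
    # RemoveChild returns the removed node; [void] keeps it out of the output.
    [void] $node.ParentNode.RemoveChild($node)
}

$xml.OuterXml
```

Here My, Your and his match the predicate, while sometags and hasContent do not, because they contain a child element or text.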




Answer 3:


You should look for a solution that uses System.Xml.XmlDocument, but it's also possible with a regex:

$xml = @"
<document>
    <My></My>
    <Your/>
    <sometags>
        <his>
        </his>
        <hasContent>sdfaf</hasContent>
    </sometags>
</document>
"@

$xml -replace '<(\w+)>\s*</\1>|<\w+\s*/>', ''
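
A single -replace pass, like a single XPath pass, leaves behind any parent that only becomes empty once its children are removed (for example <outer><inner/></outer>). The System.Xml.XmlDocument route recommended above can handle that by repeating the removal until nothing matches. The following is only a sketch: the function name Remove-EmptyXmlElement is hypothetical, and the choice to leave the document element in place is an assumption made so the result stays well-formed.

```powershell
# Hypothetical helper: removes empty elements in place, repeating until none
# are left, so parents emptied by an earlier pass are removed as well.
function Remove-EmptyXmlElement {
    param (
        [Parameter(Mandatory = $true)][xml] $Xml
    )

    # No attributes, no child elements, no non-whitespace text.
    # '/*//*' limits the search to descendants of the document element,
    # so the root itself is never removed (an assumption of this sketch).
    $xpath = '/*//*[not(@*) and not(*) and not(normalize-space())]'

    do {
        # Copy the matches into an array before modifying the tree.
        $empty = @($Xml.SelectNodes($xpath))
        foreach ($node in $empty) {
            [void] $node.ParentNode.RemoveChild($node)
        }
    } while ($empty.Count -gt 0)
}

# Usage: 'outer' only becomes empty after 'inner' has been removed.
$doc = [xml] '<document><outer><inner/></outer><keep>text</keep></document>'
Remove-EmptyXmlElement $doc
$doc.OuterXml
```

The function modifies the document it is given rather than returning a copy, which keeps the sketch short; elements that carry attributes are deliberately left alone, matching the XPath answers.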


Source: https://stackoverflow.com/questions/30474517/how-to-detect-and-remove-empty-xml-tags
