Question
I've got a bunch of XML files, and I want to detect and remove the empty tags inside them, for example:
<My></My>
<Your/>
<sometags>
<his>
</his>
<hasContent>sdfaf</hasContent>
</sometags>
These are all kinds of empty tags (My, Your, his) that I want to remove. Does PowerShell support this kind of empty-tag detection, no matter how deeply the tags are nested inside other tags?
Answer 1:
# Pretty-prints an XML document with the given indentation width.
function Format-XML
{
    param (
        [parameter(Mandatory = $true)][xml] $xml,
        [parameter(Mandatory = $false)][int] $indent = 4
    )
    try
    {
        $Error.Clear()
        $StringWriter = New-Object System.IO.StringWriter
        $XmlWriter = New-Object System.Xml.XmlTextWriter $StringWriter
        $XmlWriter.Formatting = "indented"
        $XmlWriter.Indentation = $indent
        $xml.WriteContentTo($XmlWriter)
        $XmlWriter.Flush()
        $StringWriter.Flush()
        return $StringWriter.ToString()
    }
    catch
    {
        Write-Host "$($MyInvocation.InvocationName): $_"; return $null
    }
}
$xml = [xml] @"
<document>
<My></My>
<Your/>
<sometags>
<his>
</his>
<hasContent>sdfaf</hasContent>
</sometags>
</document>
"@
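# The "magic" part is in this XPath expression: no attributes, no child elements, no text
$nodes = $xml.SelectNodes("//*[count(@*) = 0 and count(child::*) = 0 and not(string-length(text()) > 0)]")
# Copy the matches to an array first, then remove each one from its parent
@($nodes) | ForEach-Object {
    [void] $_.ParentNode.RemoveChild($_)
}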
Format-Xml $xml
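The question mentions a bunch of XML files, so here is a minimal sketch of how the same idea might be applied to a file on disk. The function names Remove-EmptyXmlElement and Remove-EmptyXmlElementFromFile are hypothetical and not part of the answer above, and the XPath is a variation that uses normalize-space() so that whitespace-only text also counts as empty. It makes a single pass, so a parent that becomes empty only after its children are removed is kept.

```powershell
# Hypothetical helper: removes empty elements from a loaded document
# and returns how many were removed.
function Remove-EmptyXmlElement {
    param (
        [Parameter(Mandatory = $true)][System.Xml.XmlDocument] $Document
    )
    # An element counts as empty when it has no attributes, no child
    # elements, and no non-whitespace text.
    $xpath = '//*[count(@*) = 0 and count(child::*) = 0 and string-length(normalize-space()) = 0]'
    # Snapshot the matches into an array before modifying the tree.
    $nodes = @($Document.SelectNodes($xpath))
    foreach ($node in $nodes) {
        [void] $node.ParentNode.RemoveChild($node)
    }
    return $nodes.Count
}

# Hypothetical helper: loads a file, cleans it, and saves it back
# only if something was actually removed.
function Remove-EmptyXmlElementFromFile {
    param (
        [Parameter(Mandatory = $true)][string] $Path
    )
    $doc = New-Object System.Xml.XmlDocument
    $doc.Load($Path)
    $removed = Remove-EmptyXmlElement -Document $doc
    if ($removed -gt 0) { $doc.Save($Path) }
    return $removed
}
```

To process a folder, something like Get-ChildItem -Filter *.xml | ForEach-Object { Remove-EmptyXmlElementFromFile -Path $_.FullName } should work. Passing the full path matters because XmlDocument.Load resolves relative paths against the .NET working directory, which can differ from the current PowerShell location. The files are overwritten in place, so try it on copies first.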
Answer 2:
I'm not fluent in PowerShell, so this is only a small addition to @DavidBrabant's good answer, specifically the XPath part. The XPath for detecting empty elements can be a bit simpler:
//*[not(@*) and not(*) and not(normalize-space())]
The predicates (everything within []), in order, check that the current element has no attributes, has no child elements, and has no text other than whitespace.
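For completeness, here is a minimal sketch of how the simpler XPath might be plugged into PowerShell. It reuses the sample document from the question and is only an illustration, not the answerer's own code. Note the not(...) around normalize-space(): without it the expression would select the elements that do contain text.

```powershell
# Sample document from the question, wrapped in a root element.
$xml = [xml] @"
<document>
<My></My>
<Your/>
<sometags>
<his>
</his>
<hasContent>sdfaf</hasContent>
</sometags>
</document>
"@

# Elements with no attributes, no child elements, and no non-whitespace text.
$empty = $xml.SelectNodes('//*[not(@*) and not(*) and not(normalize-space())]')

# Copy the matches to an array first so removal does not disturb the enumeration.
foreach ($node in @($empty)) {
    [void] $node.ParentNode.RemoveChild($node)
}

# Print the remaining markup.
$xml.OuterXml
```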
Answer 3:
You should look for a solution that uses System.Xml.XmlDocument. But it's also possible using a regex:
$xml = @"
<document>
<My></My>
<Your/>
<sometags>
<his>
</his>
<hasContent>sdfaf</hasContent>
</sometags>
</document>
"@
$xml -replace '(?:<(\w*)>\s*<\/\1>)|<(\w*)\/>', ''
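As a rough illustration of why an XmlDocument-based solution is the safer choice, the sketch below runs the same pattern against a few one-line samples. The samples with a space before the slash and with a namespace prefix are my own additions, not from the question; the pattern does not match them, so they pass through unchanged.

```powershell
# Same pattern as above.
$pattern = '(?:<(\w*)>\s*<\/\1>)|<(\w*)\/>'

# The first two samples come from the question; the last two are
# hypothetical variations that are still empty elements in XML terms.
$samples = '<My></My>', '<Your/>', '<Your />', '<ns:item/>'

foreach ($s in $samples) {
    $left = $s -replace $pattern, ''
    # Show each input next to what remains after the replacement.
    '{0,-12} -> "{1}"' -f $s, $left
}
```

The regex also leaves behind the blank lines where the removed tags used to be, since it only deletes the tags themselves.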
Source: https://stackoverflow.com/questions/30474517/how-to-detect-and-remove-empty-xml-tags