Question
I have an XML file with a great many nodes, each with a large number of attributes. For simplicity, assume the XML looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<root>
  <header />
  <group>
    <node1 attr1="x" attr2="y" attr3="z" />
    <node2 attr4="x" attr5="y" attr6="z" />
    <node3 attr7="x" attr8="y" attr9="z" />
    <node1 attr1="x" attr2="y" attr3="z" />
  </group>
</root>
I would like to reduce this XML to a smaller version by trimming the content of /root/group/, eliminating both attributes and nodes:
- All nodes named node3 should be removed.
- Nodes named node1 should keep only the attribute attr1.
- Nodes named node2 should keep only the attributes attr5 and attr6.
I could write a simple XSLT stylesheet for this using empty templates that match whatever should be dropped and do nothing, e.g.
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" encoding="UTF-8" indent="yes" />
  <!-- blacklist: empty templates suppress the matched nodes and attributes -->
  <xsl:template match="/root/group/node3" />
  <xsl:template match="/root/group/node1/@attr2" />
  <xsl:template match="/root/group/node1/@attr3" />
  <xsl:template match="/root/group/node2/@attr4" />
  <!-- identity transform: copy everything else unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
This, however, does not fit my needs. The stylesheet above states what I do not want, whereas I would like to state what I do want by means of a whitelist.
Two questions I found answer this partially: one introduces a whitelist for the nodes, the other a whitelist for the attributes. How can I do this elegantly with a single whitelist, or is there a better method? Can it be done with a whitelist of the form:
<whitelist>
  <node1 attr1="" />
  <node2 attr5="" attr6="" />
</whitelist>
Remark: I can only use XSLT 1.0.
Expected output:
<?xml version="1.0" encoding="UTF-8"?>
<root>
  <header />
  <group>
    <node1 attr1="x" />
    <node2 attr5="y" attr6="z" />
    <node1 attr1="x" />
  </group>
</root>
Related questions:
- XSLT - How to keep only wanted elements from XML
- XSL : Copy Attributes That Match A Whitelist
Answer 1:
Would this do it for you? Have a single template that matches the children of the group elements, then check the whitelist embedded in the stylesheet to see whether to copy that node and, if so, which attributes should be copied too:
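<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:ns="ns" version="1.0">
  <xsl:output method="xml" encoding="UTF-8" indent="yes" />
  <!-- whitelist: one entry per allowed element, listing its allowed attributes -->
  <ns:WhiteList>
    <node>
      <name>node1</name>
      <attr>attr1</attr>
    </node>
    <node>
      <name>node2</name>
      <attr>attr5</attr>
      <attr>attr6</attr>
    </node>
  </ns:WhiteList>
  <!-- identity transform for everything that is not a child of group -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- children of group: copy only if a whitelist entry with the same name exists -->
  <xsl:template match="group/*">
    <!-- document('') reads the stylesheet itself, which holds the whitelist -->
    <xsl:variable name="node" select="document('')//ns:WhiteList/node[name = name(current())]" />
    <xsl:if test="$node">
      <xsl:copy>
        <!-- keep only attributes whose name equals one of the entry's attr values -->
        <xsl:apply-templates select="@*[name() = $node/attr]|node()" />
      </xsl:copy>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>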
Answer 2:
The simple way is to make your stylesheet itself be the "whitelist":
XSLT 1.0
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <!-- identity transform -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- group: process only the whitelisted child elements -->
  <xsl:template match="group">
    <xsl:copy>
      <xsl:apply-templates select="node1 | node2"/>
    </xsl:copy>
  </xsl:template>
  <!-- node1: keep only attr1 -->
  <xsl:template match="node1">
    <xsl:copy>
      <xsl:apply-templates select="@attr1"/>
    </xsl:copy>
  </xsl:template>
  <!-- node2: keep only attr5 and attr6 -->
  <xsl:template match="node2">
    <xsl:copy>
      <xsl:apply-templates select="@attr5 | @attr6"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
Otherwise it can get pretty complicated:
- It's relatively easy to test whether a node appears in the given whitelist by its name (as is done in the other questions you linked to).
- It is not so easy, especially in XSLT 1.0, to test whether the node appears at the same position in the tree's hierarchy (i.e. that the path to it is the same as the path to a node in the whitelist).
If it's sufficient to test by name only, then you could do something like:
XSLT 1.0
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:my="http://example.com/my"
    exclude-result-prefixes="my">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <!-- whitelist embedded in the stylesheet, mirroring the wanted output -->
  <my:whitelist>
    <root>
      <header/>
      <group>
        <node1 attr1=""/>
        <node2 attr5="" attr6=""/>
      </group>
    </root>
  </my:whitelist>
  <!-- document('') reads the stylesheet itself -->
  <xsl:variable name="whitelist" select="document('')/xsl:stylesheet/my:whitelist"/>
  <!-- copy an element only if an element of that name occurs anywhere in the whitelist -->
  <xsl:template match="*">
    <xsl:if test="$whitelist//*[name() = name(current())]">
      <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>
  <!-- copy an attribute only if an attribute of that name occurs anywhere in the whitelist -->
  <xsl:template match="@*">
    <xsl:if test="$whitelist//@*[name() = name(current())]">
      <xsl:copy/>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
But then of course you could simplify the structure of the whitelist, since the test looks only at names and the hierarchy inside the whitelist is completely ignored.
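For instance, the flat whitelist proposed in the question could be used roughly as follows. This is only a sketch under a few assumptions: filtering is limited to the children of group, the variable name entry is made up for illustration, and each attribute is checked against the whitelist entry of its own element rather than against all whitelisted attribute names. As with the stylesheets above, document('') requires that the processor can re-read the stylesheet.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:my="http://example.com/my"
    exclude-result-prefixes="my">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- flat whitelist: one element per allowed name, carrying its allowed attributes -->
  <my:whitelist>
    <node1 attr1=""/>
    <node2 attr5="" attr6=""/>
  </my:whitelist>

  <xsl:variable name="whitelist" select="document('')/xsl:stylesheet/my:whitelist"/>

  <!-- identity transform for everything that is not a child of group -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- children of group: keep only whitelisted elements -->
  <xsl:template match="group/*">
    <!-- "entry" (hypothetical name) is the whitelist element with the same name, if any -->
    <xsl:variable name="entry" select="$whitelist/*[name() = name(current())]"/>
    <xsl:if test="$entry">
      <xsl:copy>
        <!-- keep an attribute only if this element's own entry carries it -->
        <xsl:for-each select="@*">
          <xsl:if test="$entry/@*[name() = name(current())]">
            <xsl:copy/>
          </xsl:if>
        </xsl:for-each>
        <xsl:apply-templates select="node()"/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

The xsl:for-each is there because current() inside a predicate on @* would still refer to the element being processed, not to the attribute under test.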
For an example of how this could be done with a whitelist consisting of paths, see: https://stackoverflow.com/a/30276667/3016153
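The general idea of a path-based whitelist can be sketched like this. It is a minimal illustration and not necessarily the approach taken in the linked answer; the my:whitelist/path layout and the names paths and path are assumptions. Each node's path is assembled from the names of its ancestors, and the node is copied only if exactly that string is listed. Positional predicates and namespaces are not handled.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:my="http://example.com/my"
    exclude-result-prefixes="my">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- whitelist of full paths (hypothetical layout) -->
  <my:whitelist>
    <path>/root</path>
    <path>/root/header</path>
    <path>/root/group</path>
    <path>/root/group/node1</path>
    <path>/root/group/node1/@attr1</path>
    <path>/root/group/node2</path>
    <path>/root/group/node2/@attr5</path>
    <path>/root/group/node2/@attr6</path>
  </my:whitelist>

  <xsl:variable name="paths" select="document('')/xsl:stylesheet/my:whitelist/path"/>

  <!-- elements: build /a/b/c from the ancestor-or-self names, copy if listed -->
  <xsl:template match="*">
    <xsl:variable name="path">
      <xsl:for-each select="ancestor-or-self::*">
        <xsl:value-of select="concat('/', name())"/>
      </xsl:for-each>
    </xsl:variable>
    <xsl:if test="$paths[. = $path]">
      <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>

  <!-- attributes: path of the owning element plus /@name -->
  <xsl:template match="@*">
    <xsl:variable name="path">
      <xsl:for-each select="ancestor::*">
        <xsl:value-of select="concat('/', name())"/>
      </xsl:for-each>
      <xsl:value-of select="concat('/@', name())"/>
    </xsl:variable>
    <xsl:if test="$paths[. = $path]">
      <xsl:copy/>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Unlike the name-only test, a node1 element appearing somewhere other than /root/group would be dropped here, at the cost of having to list every ancestor path explicitly.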
Source: https://stackoverflow.com/questions/54420288/keep-only-white-listed-elements-and-or-attributes