Question
I have an XML file like this:
myxml.xml
<?xml version="1.0" encoding="utf-8"?>
<products nb="2" type="new">
<product ean="12345677654321">
<sku>Product1</sku>
<parameters>
<short_desc> Short description of the product1 </short_desc>
<price currency="USD">19.65</price>
</parameters>
</product>
<product ean="12345644654321">
<sku>Product2</sku>
<parameters>
<long_desc> Long description of the product2 </long_desc>
<price currency="USD">19.65</price>
<vat>20</vat>
</parameters>
</product>
</products>
I would like to get an array like this:
/products/@nb
/products/@type
/products/product/@ean
/products/product/sku
/products/product/parameters/short_desc
/products/product/parameters/long_desc
/products/product/parameters/price
/products/product/parameters/price/@currency
/products/product/parameters/vat
I almost get this result with the following code:
getpath.xsl
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:variable name="vApos">'</xsl:variable>
<xsl:template match="*[@* or not(*)] ">
<xsl:if test="not(*)">
<xsl:apply-templates select="ancestor-or-self::*" mode="path"/>
<xsl:text>
</xsl:text>
</xsl:if>
<xsl:apply-templates select="@*|*"/>
</xsl:template>
<xsl:template match="*" mode="path">
<xsl:value-of select="concat('/',name())"/>
<xsl:variable name="vnumPrecSiblings" select=
"count(preceding-sibling::*[name()=name(current())])"/>
<xsl:if test="$vnumPrecSiblings">
<xsl:value-of select="concat('[', $vnumPrecSiblings +1, ']')"/>
</xsl:if>
</xsl:template>
<xsl:template match="@*">
<xsl:apply-templates select="../ancestor-or-self::*" mode="path"/>
<xsl:value-of select="concat('/@',name())"/>
<xsl:text>
</xsl:text>
</xsl:template>
</xsl:stylesheet>
$xslDoc = new \DOMDocument();
$xslDoc->substituteEntities = true;
$xslDoc->load('getpath.xsl');
$xmlDoc = new \DOMDocument();
$xmlDoc->load('myxml.xml');
$proc = new \XSLTProcessor();
$proc->importStylesheet($xslDoc);
$rest = $proc->transformToXML($xmlDoc);
$res = preg_replace("/\\s/"," ", $rest);
$path = explode(" ", $res);
$fields = [];
foreach ($path as $key => $value) {
    // keep non-empty paths that contain no positional predicate such as [2]
    if (!empty($value) && !preg_match("/\[.*\]/", $value)) {
        $fields[] = $value;
    }
}
return $fields;
This code gives me:
/products/@nb
/products/@type
/products/product/@ean
/products/product/sku
/products/product/parameters/short_desc
/products/product/parameters/price
/products/product/parameters/price/@currency
/products/product/parameters/long_desc and /products/product/parameters/vat are missing :(
How can I parse the full XML with XSLT? Or do you have a solution without XSLT?
Answer 1:
Yes, you can do it with some XPath in PHP.
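// $xml holds the XML document as a string
$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

function getNodeExpression(DOMNode $node, array &$namespaces) {
    $name = $node->localName;
    $namespace = $node->namespaceURI;
    if ($namespace == '') {
        return ($node instanceOf DOMAttr ? '@' : '').$name;
    } elseif (isset($namespaces[$namespace])) {
        $prefix = $namespaces[$namespace];
    } else {
        $xmlns = $prefix = ($node->prefix == '') ? 'ns' : $node->prefix;
        $i = 1;
        while (in_array($xmlns, $namespaces)) {
            $xmlns = $prefix.'-'.$i;
            $i++;
        }
        // store and use the unique prefix, not the one read from the node
        $namespaces[$namespace] = $prefix = $xmlns;
    }
    return ($node instanceOf DOMAttr ? '@' : '').$prefix.':'.$name;
}

$result = [];
$namespaces = [];
// select all leaf elements and all attributes
foreach ($xpath->evaluate('//*[count(*) = 0]|//@*') as $node) {
    $path = '';
    // append one step per ancestor so the path starts at the document element
    foreach ($xpath->evaluate('ancestor::*', $node) as $parent) {
        $path .= '/'.getNodeExpression($parent, $namespaces);
    }
    $path .= '/'.getNodeExpression($node, $namespaces);
    // use the path as array key to drop duplicates
    $result[$path] = TRUE;
}
var_dump(array_keys($result), $namespaces);
Output of the original demo (https://eval.in/118054), whose input also carried an xml:lang attribute on long_desc. The paths in that listing show only the nearest ancestor, because the demo assigned $path inside the inner loop (=) instead of appending to it (.=):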
array(10) {
[0]=>
string(13) "/products/@nb"
[1]=>
string(15) "/products/@type"
[2]=>
string(13) "/product/@ean"
[3]=>
string(12) "/product/sku"
[4]=>
string(22) "/parameters/short_desc"
[5]=>
string(17) "/parameters/price"
[6]=>
string(16) "/price/@currency"
[7]=>
string(21) "/parameters/long_desc"
[8]=>
string(20) "/long_desc/@xml:lang"
[9]=>
string(15) "/parameters/vat"
}
array(1) {
["http://www.w3.org/XML/1998/namespace"]=>
string(3) "xml"
}
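For the XML in the question, which uses no namespaces, the same idea can be sketched without the prefix handling. This is a minimal sketch, not the answer's exact code: the function name collectLeafPaths is hypothetical, and it swaps the ancestor::* query for a walk up the parent chain so that the step order does not depend on how the node list is sorted.

```php
<?php
// Sample input copied from the question (XML declaration omitted).
$xml = <<<'XML'
<products nb="2" type="new">
  <product ean="12345677654321">
    <sku>Product1</sku>
    <parameters>
      <short_desc> Short description of the product1 </short_desc>
      <price currency="USD">19.65</price>
    </parameters>
  </product>
  <product ean="12345644654321">
    <sku>Product2</sku>
    <parameters>
      <long_desc> Long description of the product2 </long_desc>
      <price currency="USD">19.65</price>
      <vat>20</vat>
    </parameters>
  </product>
</products>
XML;

// Hypothetical helper: returns the distinct paths of all leaf elements and attributes.
function collectLeafPaths(string $xml): array
{
    $dom = new DOMDocument();
    $dom->loadXML($xml);
    $xpath = new DOMXPath($dom);

    $paths = [];
    foreach ($xpath->evaluate('//*[count(*) = 0]|//@*') as $node) {
        $isAttribute = $node instanceof DOMAttr;
        $path = '/' . ($isAttribute ? '@' : '') . $node->nodeName;

        // Walk up to the document element, prepending one step per ancestor.
        $current = $isAttribute ? $node->ownerElement : $node->parentNode;
        while ($current instanceof DOMElement) {
            $path = '/' . $current->nodeName . $path;
            $current = $current->parentNode;
        }

        // Using the path as key collapses repeated product entries.
        $paths[$path] = true;
    }
    return array_keys($paths);
}

$paths = collectLeafPaths($xml);
print_r($paths);
```

Because identical paths from both product elements share one array key, the result contains each of the nine paths requested in the question exactly once.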
The complex part is resolving the namespaces and generating prefixes for them. Let's take a detailed look:
Get the local name (tag name without namespace prefix) and the namespace.
$name = $node->localName;
$namespace = $node->namespaceURI;
If the namespace is empty, we do not need a prefix and return an expression with just the node name.
if ($namespace == '') {
return ($node instanceOf DOMAttr ? '@' : '').$name;
Otherwise, check whether the namespace was already used on another node and reuse that prefix.
} elseif (isset($namespaces[$namespace])) {
$prefix = $namespaces[$namespace];
If the namespace is unknown, read the prefix used on this node. If the node does not use a prefix, use the string "ns".
} else {
$xmlns = $prefix = ($node->prefix == '') ? 'ns' : $node->prefix;
Validate that the prefix is not already used for another namespace. If it is, append a number and increase it until we have a unique prefix.
$i = 1;
while (in_array($xmlns, $namespaces)) {
$xmlns = $prefix.'-'.$i;
$i++;
}
Store the namespace => prefix definition for the next call, using the unique prefix found by the loop.
$namespaces[$namespace] = $prefix = $xmlns;
Return an expression including the prefix.
return ($node instanceOf DOMAttr ? '@' : '').$prefix.':'.$name;
The namespaces array can then be used to register all needed namespace prefixes on an XPath object.
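Registering the collected prefixes could look like the following sketch. The sample document, the urn:example:* namespace URIs, and the contents of $namespaces are made up for illustration; only DOMXPath::registerNamespace() and the namespace URI => prefix layout of the array come from the code above.

```php
<?php
// Hypothetical input with one default and one prefixed namespace.
$xml = '<root xmlns="urn:example:a" xmlns:b="urn:example:b">'
     . '<b:item>1</b:item><item>2</item></root>';

// Hypothetical map in the shape built by getNodeExpression(): namespace URI => prefix.
$namespaces = [
    'urn:example:a' => 'ns',
    'urn:example:b' => 'b',
];

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// Register every generated prefix so expressions can use it.
foreach ($namespaces as $uri => $prefix) {
    $xpath->registerNamespace($prefix, $uri);
}

// The generated expressions can now be evaluated against the document.
$values = [];
foreach ($xpath->evaluate('/ns:root/b:item | /ns:root/ns:item') as $node) {
    $values[] = $node->textContent;
}
sort($values);
print_r($values);
```

The prefixes used in the expressions only need to match the registered ones. They do not have to match the prefixes in the document, which is why the default namespace can be addressed as ns here.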
Source: https://stackoverflow.com/questions/22310131/get-all-leaf-from-complex-xml-with-attributes