Question
I need to parse an RSS feed that looks something like this:
<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:x-wr="http://www.w3.org/2002/12/cal/prod/Apple_Comp_628d9d8459c556fa#" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:x-example="http://www.example.com/rss/x-example" xmlns:x-microsoft="http://schemas.microsoft.com/x-microsoft" xmlns:xCal="urn:ietf:params:xml:ns:xcal" version="2.0">
<channel>
<item>
<title>About Apples</title>
<author>David K. Lowie</author>
<description>Some description about apples</description>
<xCal:description>This is the full description about apples</xCal:description>
</item>
<item>
<title>About Oranges</title>
<author>Marry L. Jones</author>
<description>Some description about oranges</description>
<xCal:description>This is the full description about oranges</xCal:description>
</item>
</channel>
</rss>
In PHP, I parse it like this:
$rss = new DOMDocument();
$rss->load( "http://www.example.com/books.rss" );
foreach( $rss->getElementsByTagName("item") as $node ) {
    echo $node->getElementsByTagName("title")->item(0)->nodeValue;
    echo $node->getElementsByTagName("author")->item(0)->nodeValue;
    echo $node->getElementsByTagName("description")->item(0)->nodeValue;
    echo $node->getElementsByTagName("xCal:description")->item(0)->nodeValue;
}
I can read everything except the xCal:description node. (The node names are exactly as shown: description and xCal:description.)
- How do I parse (read) nodes like xCal:description?
- Is it because of the similar node names, description and xCal:description?
(I can't change the RSS source since it's not under my control.)
Please kindly help.
Answer 1:
Use getElementsByTagNameNS():
$node->getElementsByTagNameNS("urn:ietf:params:xml:ns:xcal", "description")->item(0)->nodeValue
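The prefix xCal is only an alias for the namespace URI declared on the rss element, so the namespace-aware method takes that URI plus the local name. A minimal, self-contained sketch of the full loop follows. It assumes a shortened inline copy of the feed (loaded with loadXML() instead of the remote URL), and the constant XCAL_NS and the $full array are names made up for this illustration. The length check is there in case an item has no xCal:description.

```php
<?php
// Hypothetical inline feed standing in for the remote RSS document.
$xml = <<<'XML'
<rss xmlns:xCal="urn:ietf:params:xml:ns:xcal" version="2.0">
  <channel>
    <item>
      <title>About Apples</title>
      <description>Some description about apples</description>
      <xCal:description>This is the full description about apples</xCal:description>
    </item>
  </channel>
</rss>
XML;

// The URI the feed binds to the xCal prefix (illustrative constant name).
const XCAL_NS = 'urn:ietf:params:xml:ns:xcal';

$rss = new DOMDocument();
$rss->loadXML($xml);

$full = [];
foreach ($rss->getElementsByTagName('item') as $item) {
    // Match by namespace URI + local name, not by the "xCal:" prefix.
    $nodes = $item->getElementsByTagNameNS(XCAL_NS, 'description');
    // Guard against items that lack the element.
    $full[] = $nodes->length > 0 ? $nodes->item(0)->textContent : '';
}

print_r($full);
```

For the real feed, swapping loadXML($xml) back to load() with the feed URL should be the only change needed.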
Answer 2:
Using the namespace-aware variants of the DOM methods is a correct answer, but you might also want to take a look at XPath. It is a much more comfortable way to fetch data from a DOM.
For XPath expressions, you can register your own prefixes for the namespaces as needed.
$rss = new DOMDocument();
$rss->load("http://www.example.com/books.rss");
$xpath = new DOMXPath($rss);
$xpath->registerNamespace('xc', 'urn:ietf:params:xml:ns:xcal');
foreach($xpath->evaluate("//item") as $item) {
    echo $xpath->evaluate('string(title)', $item), "\n";
    echo $xpath->evaluate('string(author)', $item), "\n";
    echo $xpath->evaluate('string(description)', $item), "\n";
    echo $xpath->evaluate('string(xc:description)', $item), "\n";
}
Output:
About Apples
David K. Lowie
Some description about apples
This is the full description about apples
About Oranges
Marry L. Jones
Some description about oranges
This is the full description about oranges
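As a variation on the same idea, the values can be collected into arrays instead of echoed. This is only a sketch: it uses a shortened inline copy of the feed, and the prefix cal and the array keys title, summary and full are arbitrary choices made here. It also shows that the registered prefix does not have to match the one used in the document, and that an unprefixed description in XPath selects only the element without a namespace, so the two similar names do not collide.

```php
<?php
// Hypothetical inline feed; the document itself uses the prefix "xCal".
$xml = <<<'XML'
<rss xmlns:xCal="urn:ietf:params:xml:ns:xcal" version="2.0">
  <channel>
    <item>
      <title>About Apples</title>
      <description>Some description about apples</description>
      <xCal:description>This is the full description about apples</xCal:description>
    </item>
  </channel>
</rss>
XML;

$rss = new DOMDocument();
$rss->loadXML($xml);

$xpath = new DOMXPath($rss);
// "cal" is an arbitrary prefix chosen for this sketch; only the URI must match.
$xpath->registerNamespace('cal', 'urn:ietf:params:xml:ns:xcal');

$items = [];
foreach ($xpath->evaluate('//item') as $item) {
    $items[] = [
        'title'   => $xpath->evaluate('string(title)', $item),
        // Unprefixed name: the plain <description> in no namespace.
        'summary' => $xpath->evaluate('string(description)', $item),
        // Prefixed name: the element in the xCal namespace.
        'full'    => $xpath->evaluate('string(cal:description)', $item),
    ];
}

print_r($items);
```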
Source: https://stackoverflow.com/questions/38095199/php-domdocument-how-to-parse-custom-xml-rss-tag-names-with-colons