CakePHP Xml utility library triggers DOMDocument warning

不想你离开。 提交于 2019-12-17 07:52:23

问题


I'm generating XML in a view with CakePHP's Xml core library:

$xml = Xml::build($data, array('return' => 'domdocument'));
echo $xml->saveXML();

View is fed from the controller with an array:

$this->set(
    array(
        'data' => array(
            'root' => array(
                array(
                    '@id' => 'A & B: OK',
                    'name' => 'C & D: OK',
                    'sub1' => array(
                        '@id' => 'E & F: OK',
                        'name' => 'G & H: OK',
                        'sub2' => array(
                            array(
                                '@id' => 'I & J: OK',
                                'name' => 'K & L: OK',
                                'sub3' => array(
                                    '@id' => 'M & N: OK',
                                    'name' => 'O & P: OK',
                                    'sub4' => array(
                                        '@id' => 'Q & R: OK',
                                        '@'   => 'S & T: ERROR',
                                    ),
                                ),
                            ),
                        ),
                    ),
                ),
            ),
        ),
    )
);

For whatever the reason, CakePHP is issuing an internal call like this:

$dom = new DOMDocument;
$key = 'sub4';
$childValue = 'S & T: ERROR';
$dom->createElement($key, $childValue);

... which triggers a PHP warning:

Warning (2): DOMDocument::createElement(): unterminated entity reference               T [CORE\Cake\Utility\Xml.php, line 292

... because (as documented), DOMDocument::createElement does not escape values. However, it only does it in certain nodes, as the test case illustrates.

Am I doing something wrong or I just hit a bug in CakePHP?


回答1:


This is a bug in PHPs DOMDocument::createElement() method. Here are two ways to avoid the problem.

Create Text Nodes

Create the textnode separately and append it to the element node.

$dom = new DOMDocument;
$dom
  ->appendChild($dom->createElement('element'))
  ->appendChild($dom->createTextNode('S & T: ERROR'));

var_dump($dom->saveXml());

Output:

string(58) "<?xml version="1.0"?>
<element>S &amp; T: ERROR</element>
"

This is the originally intended way to add text nodes to a DOM. You always create a node (element, text , cdata, ...) and append it to its parent node. You can add more then one node and different kind of nodes to one parent. Like in the following example:

$dom = new DOMDocument;
$p = $dom->appendChild($dom->createElement('p'));
$p->appendChild($dom->createTextNode('Hello '));
$b = $p->appendChild($dom->createElement('b'));
$b->appendChild($dom->createTextNode('World!'));

echo $dom->saveXml();

Output:

<?xml version="1.0"?>
<p>Hello <b>World!</b></p>

Property DOMNode::$textContent

DOM Level 3 introduced a new node property called textContent. It abstracts the content/value of a node depending on the node type. Setting the $textContent of an element node will replace all its child nodes with a single text node. Reading it returns the content of all descendant text nodes.

$dom = new DOMDocument;
$dom
  ->appendChild($dom->createElement('element'))
  ->textContent = 'S & T: ERROR';

var_dump($dom->saveXml());



回答2:


This is in fact because the DOMDocument methods wants correct characters to be outputted in html; that is, characters such as & will break content and generate a unterminated entity reference error

just htmlentities() it before using it to create elements:

$dom = new DOMDocument;
$key = 'sub4';
$childValue = htmlentities('S & T: ERROR');
$dom->createElement($key ,$childValue);



回答3:


it is because of this character: & You need to replace that with the relevant HTML entity. &amp; To perform the translation, you can use the htmlspecialchars function. You have to escape the value when writing writing to the nodeValue property. As quoted from a bug report in 2005 located here

ampersands ARE properly encoded when setting the property textContent. Unfortunately they are not encoded when the text string is passed as the optional second arguement to DOMElement::createElement You must create a text node, set the textContent, then append the text node to the new element.

htmlspecialchars($string, ENT_QUOTES, 'UTF-8');

This is the translation table:

'&' (ampersand) becomes '&amp;'
'"' (double quote) becomes '&quot;' when ENT_NOQUOTES is not set.
"'" (single quote) becomes '&#039;' (or &apos;) only when ENT_QUOTES is set.
'<' (less than) becomes '&lt;'
'>' (greater than) becomes '&gt;'

This script will do the translations recursively:

<?php
function clean($type) {
  if(is_array($type)) {
    foreach($type as $key => $value){   
     $type[$key] = clean($value);
    }
    return $type;
  } else {
    $string = htmlspecialchars($type, ENT_QUOTES, 'UTF-8');
    return $string;
  }
}

$data = array(
    'data' => array(
        'root' => array(
            array(
                '@id' => 'A & B: OK',
                'name' => 'C & D: OK',
                'sub1' => array(
                    '@id' => 'E & F: OK',
                    'name' => 'G & H: OK',
                    'sub2' => array(
                        array(
                            '@id' => 'I & J: OK',
                            'name' => 'K & L: OK',
                            'sub3' => array(
                                '@id' => 'M & N: OK',
                                'name' => 'O & P: OK',
                                'sub4' => array(
                                    '@id' => 'Q & R: OK',
                                    '@' => 'S & T: ERROR',
                                ) ,
                            ) ,
                        ) ,
                    ) ,
                ) ,
            ) ,
        ) ,
    ) ,
);

$data = clean($data);

Output

Array
(
    [data] => Array
        (
            [root] => Array
                (
                    [0] => Array
                        (
                            [@id] => A &amp; B: OK
                            [name] => C &amp; D: OK
                            [sub1] => Array
                                (
                                    [@id] => E &amp; F: OK
                                    [name] => G &amp; H: OK
                                    [sub2] => Array
                                        (
                                            [0] => Array
                                                (
                                                    [@id] => I &amp; J: OK
                                                    [name] => K &amp; L: OK
                                                    [sub3] => Array
                                                        (
                                                            [@id] => M &amp; N: OK
                                                            [name] => O &amp; P: OK
                                                            [sub4] => Array
                                                                (
                                                                    [@id] => Q &amp; R: OK
                                                                    [@] => S &amp; T: ERROR
                                                                )

                                                        )

                                                )

                                        )

                                )

                        )

                )

        )

)



回答4:


The problem seems to be in nodes that have both attributes and values thus need to use the @ syntax:

'@id' => 'A & B: OK',  // <-- Handled as plain text
'name' => 'C & D: OK', // <-- Handled as plain text
'@' => 'S & T: ERROR', // <-- Handled as raw XML

I've written a little helper function:

protected function escapeXmlValue($value){
    return is_null($value) ? null : htmlspecialchars($value, ENT_XML1, 'UTF-8');
}

... and take care of calling it manually when I create the array:

'@id' => 'A & B: OK',
'name' => 'C & D: OK',
'@' => $this->escapeXmlValue('S & T: NOW WORKS FINE'),

It's hard to say if it's bug or feature since the documentation doesn't mention it.



来源:https://stackoverflow.com/questions/22956330/cakephp-xml-utility-library-triggers-domdocument-warning

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!