How to import an XML string into a PHP DOMDocument

Submitted by 天涯浪子 on 2019-12-30 02:21:32

Question


For example, I create a DOMDocument like this:

<?php

$implementation = new DOMImplementation();

$dtd =
  $implementation->createDocumentType
  (
    'html',                                     // qualifiedName
    '-//W3C//DTD XHTML 1.0 Transitional//EN',   // publicId
    'http://www.w3.org/TR/xhtml1/DTD/xhtml1-'
      .'transitional.dtd'                       // systemId
  );

$document = $implementation->createDocument('', '', $dtd);

$elementHtml     = $document->createElement('html');
$elementHead     = $document->createElement('head');
$elementBody     = $document->createElement('body');
$elementTitle    = $document->createElement('title');
$textTitre       = $document->createTextNode('My bweb page');
$attrLang        = $document->createAttribute('lang');
$attrLang->value = 'en';

$document->appendChild($elementHtml);
$elementHtml->appendChild($elementHead);
$elementHtml->appendChild($attrLang);
$elementHead->appendChild($elementTitle);
$elementTitle->appendChild($textTitre);
$elementHtml->appendChild($elementBody);

So now, if I have an XHTML string like this:

<?php
$xhtml = '<h1>Hello</h1><p>World</p>';

How can I import it into the <body> node of my DOMDocument?

For now, the only solution I have found is something like this:

<?php
$simpleXmlElement = new SimpleXMLElement($xhtml);

$domElement = dom_import_simplexml($simpleXmlElement);

$domElement = $document->importNode($domElement, true);

$elementBody->appendChild($domElement);

This solution seems very bad to me, and it causes problems, for example when I try it with a string like this:

<?php
$xhtml = '<p>Hello&nbsp;World</p>';

OK, I can work around this problem by converting the XHTML entities to Unicode entities, but it's so ugly...

Any help?

Thanks in advance!

Related question:

  • DOMDocument::validate() problem (solved)

Answer 1:


The problem is that DOM does not know it should take the XHTML DTD into account unless you validate the document against it. Until you do that, DOM does not know any of the entities defined in the DTD, nor any of its other rules. Fortunately, we sorted out how to do the validation in that other question, so armed with that knowledge you can do

$document->validate(); // anywhere before importing the other DOM

And then import with

$fragment = $document->createDocumentFragment();
$fragment->appendXML('<h1>Hello</h1><p>Hello&nbsp;World</p>');
$document->getElementsByTagName('body')->item(0)->appendChild($fragment);
$document->formatOutput = TRUE;
echo $document->saveXml();

outputs:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title>My bweb page</title>
  </head>
  <body>
    <h1>Hello</h1>
    <p>Hello&nbsp;World</p>
  </body>
</html>

The other way to import XML into another DOM is to use importNode:

$one = new DOMDocument;
$two = new DOMDocument;
$one->loadXml('<root><foo>one</foo></root>');
$two->loadXml('<root><bar><sub>two</sub></bar></root>');
$bar = $two->documentElement->firstChild; // we want to import the bar tree
$one->documentElement->appendChild($one->importNode($bar, TRUE));
echo $one->saveXml();

outputs:

<?xml version="1.0"?>
<root><foo>one</foo><bar><sub>two</sub></bar></root>

However, this cannot work with

<h1>Hello</h1><p>Hello&nbsp;World</p>

because when you load a document into DOM, DOM overwrites everything you told it about the document before. Thus, when you use load, libxml (and thus SimpleXML, DOM and XMLReader) does not know you mean XHTML. It also does not know any of the entities defined in the DTD and will complain about them instead. But even if the string did not contain the entity, it would not be valid XML, because it lacks a root node. That is why you use the fragment.
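The two failure modes described above can be reproduced in isolation. This is a minimal sketch, assuming fresh DOMDocument instances with no DTD attached; the variable names are illustrative only. Both loadXML() calls are expected to return false, while the fragment accepts the rootless markup as long as it contains no undeclared entities.

```php
<?php
// Collect libxml errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$probe = new DOMDocument();

// Two sibling elements and no single root: not a well-formed document.
$noRoot = $probe->loadXML('<h1>Hello</h1><p>World</p>');

// One root element, but &nbsp; is undeclared because no DTD is known.
$withEntity = $probe->loadXML('<p>Hello&nbsp;World</p>');

// A fragment may hold several sibling nodes without a root element.
$target     = new DOMDocument();
$fragment   = $target->createDocumentFragment();
$asFragment = $fragment->appendXML('<h1>Hello</h1><p>World</p>');

libxml_clear_errors();

var_dump($noRoot, $withEntity, $asFragment);
```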




Answer 2:


You can use a DOMDocumentFragment for this:

$fragment = $document->createDocumentFragment();
$fragment->appendXml($xhtml);
$elementBody->appendChild($fragment);

That's all there is to it...
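Since appendXML() returns false when the string cannot be parsed, it can be worth checking the result before appending. Here is a minimal sketch of that idea; the helper name appendXmlString is hypothetical and not part of the DOM extension, and it assumes the markup is well-formed XML apart from lacking a root element.

```php
<?php
/**
 * Hypothetical helper: parse $xml as a fragment and append it to $parent.
 * Returns false (and leaves $parent untouched) if the string does not parse.
 */
function appendXmlString(DOMNode $parent, string $xml): bool
{
    // A DOMDocument has no ownerDocument, so handle that case explicitly.
    $document = $parent instanceof DOMDocument ? $parent : $parent->ownerDocument;
    $fragment = $document->createDocumentFragment();

    // Silence parser warnings for the duration of the call only.
    $previous = libxml_use_internal_errors(true);
    $parsed   = $fragment->appendXML($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$parsed) {
        return false;
    }

    $parent->appendChild($fragment);
    return true;
}

// Usage with a stand-alone document (names are illustrative):
$doc  = new DOMDocument();
$body = $doc->appendChild($doc->createElement('body'));
$ok   = appendXmlString($body, '<h1>Hello</h1><p>World</p>');
$bad  = appendXmlString($body, '<p>unclosed');
```

In the question's code, the call would be appendXmlString($elementBody, $xhtml).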

Edit: Well, if you must accept XHTML (instead of valid XML), you could use this dirty workaround:

function xhtmlToDomNode($xhtml) {
    $dom = new DOMDocument();
    // The HTML parser knows entities such as &nbsp; and needs no single root
    $dom->loadHTML('<html><body>'.$xhtml.'</body></html>');
    $fragment = $dom->createDocumentFragment();
    $body = $dom->getElementsByTagName('body')->item(0);
    // childNodes is a live list, so move the nodes one at a time
    while ($body->firstChild) {
        $fragment->appendChild($body->firstChild);
    }
    return $fragment;
}

Usage:

$fragment = xhtmlToDomNode($xhtml);
// importNode returns a copy owned by $document; append that copy
$fragment = $document->importNode($fragment, true);
$elementBody->appendChild($fragment);


Source: https://stackoverflow.com/questions/4081090/how-to-import-xml-string-in-a-php-domdocument
