PHP DOMDocument replace DOMElement child with HTML string

穿精又带淫゛_ 提交于 2019-12-30 03:17:08

问题


Using PHP I'm attempting to take an HTML string passed from a WYSIWYG editor and replace the children of an element inside of a preloaded HTML document with the new HTML.

So far I'm loading the document identifying the element I want to change by ID but the process to convert an HTML to something that can be placed inside a DOMElement is eluding me.

libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);

$element = $doc->getElementById($item_id);
if(isset($element)){
    //Remove the old children from the element
    while($element->childNodes->length){
        $element->removeChild($element->firstChild);
    }

    //Need to build the new children from $html_string and append to $element
}

回答1:


If the HTML string can be parsed as XML, you can do this (after clearing the element of all child nodes):

$fragment = $doc->createDocumentFragment();
$fragment->appendXML($html_string);
$element->appendChild($fragment);

If $html_string cannot be parsed as XML, it will fail. If it does, you’ll have to use loadHTML(), which is less strict — but it will add elements around the fragment which you will have to strip.

Unlike PHP, Javascript has the innerHTML property which allows you to do this very easily. I needed something like it for a project so I extended PHP’s DOMElement to include Javascript-like innerHTML access.

With it you can access the innerHTML property and change it just as you would in Javascript:

echo $element->innerHTML;
$elem->innerHTML = '<a href="http://example.org">example</a>';

Source: http://www.keyvan.net/2012/11/php-domdocument-replace-domelement-child-with-html-string/




回答2:


You can use loadHTML() on a fragment of code and then append the resulting created nodes into the original DOM tree.




回答3:


The current accepted answer suggests using appendXML(), but acknowledges that it won't handle complex html such as what is returned from a WYSISYG editor as specified in the original question. As suggested loadHTML() can address this. but no one has yet shown how.

This is what I believe is the best/correct answer to the original question addressing encoding issues, "Document Fragment is empty" warnings and "Wrong Document Error" errors that someone is likely to hit if they write this from scratch. I know I found them after following the hints in the previous responses.

This is code from a site I support that inserts WordPress sidebar content into the $content of a post. It assumes that $doc is a valid DOMDocument similar to the way $doc is defined in the original question. It also assumes that $element is the tag after which you wish to insert the sidebarcontent (or whatever).

            // NOTE: Cannot use a document fragment here as the AMP html is too complex for the appendXML function to accept.
            // Instead create it as a document element and insert that way.
            $node = new DOMDocument();
            // Note that we must encode it correctly or strange characters may appear.
            $node->loadHTML( mb_convert_encoding( $sidebarContent, 'HTML-ENTITIES', 'UTF-8') );
            // Now we need to move this document element into the scope of the content document 
            // created above or the insert/append will be rejected.
            $node = $doc->importNode( $node->documentElement, true );
            // If there is a next sibling, insert before it.
            // If not, just add it at the end of the element we did find.
            if (  $element->nextSibling ) {
                $element->parentNode->insertBefore( $node, $element->nextSibling );
            } else {
                $element->parentNode->appendChild($node);
            }

After all of this is done, if you don't want to have the source of a full HTML document with body tags and what not, you can generate the more localized html with this:

    // Now because we have moved the post content into a full document, we need to get rid of the 
    // extra elements that make it a document and not a fragment
    $body = $doc->getElementsByTagName( 'body' );
    $body = $body->item(0);

    // If you need an element with a body tag, you can do this.
    // return $doc->savehtml( $body );

    // Extract the html from the body tag piece by piece to ensure valid html syntax in destination document
    $bodyContent = ''; 
    foreach( $body->childNodes as $node ) { 
            $bodyContent .= $body->ownerDocument->saveHTML( $node ); 
    } 
    // Now return the full content with the new content added. 
    return $bodyContent;



回答4:


I know this is an old thread (but reply on this because also looking for a solution to this). I have made an easy method to replace content with just one single line when using it. To understand the method better, I also add some context named functions.

This is now a part of my library, so that's the reason of all function names here, all functions starts with the prefix 'su'.

It is very easy to use and very powerful (and quite less code).

Here is the code:

function suSetHtmlElementById( &$oDoc, &$s, $sId, $sHtml, $bAppend = false, $bInsert = false, $bAddToOuter = false )
 {
    if( suIsValidString( $s ) && suIsValidString( $sId ))
    {
     $bCreate = true;
     if( is_object( $oDoc ))
     {
       if( !( $oDoc instanceof DOMDocument ))
        { return false; }
       $bCreate = false;
     }

     if( $bCreate )
      { $oDoc = new DOMDocument(); }

     libxml_use_internal_errors(true);
     $oDoc->loadHTML($s);
     libxml_use_internal_errors(false);
     $oNode = $oDoc->getElementById( $sId );

     if( is_object( $oNode ))
     { 
       $bReplaceOuter = ( !$bAppend && !$bInsert );

       $sId = uniqid('SHEBI-');
       $aId = array( "<!-- $sId -->", "<!--$sId-->" );

       if( $bReplaceOuter )
       {
         if( suIsValidString( $sHtml ) )
         {
             $oNode->parentNode->replaceChild( $oDoc->createComment( $sId ), $oNode );
             $s = $oDoc->saveHtml();
             $s = str_replace( $aId, $sHtml, $oDoc->saveHtml());
         }
         else { $oNode->parentNode->removeChild( $oNode ); 
                $s = $oDoc->saveHtml();
              }
         return true;
       }

       $bReplaceInner = ( $bAppend && $bInsert );
       $sThis = null;

       if( !$bReplaceInner )
       {
         $sThis = $oDoc->saveHTML( $oNode );
         $sThis = ($bInsert?$sHtml:'').($bAddToOuter?$sThis:(substr($sThis,strpos($sThis,'>')+1,-(strlen($oNode->nodeName)+3)))).($bAppend?$sHtml:''); 
       }

       if( !$bReplaceInner && $bAddToOuter )
       { 
          $oNode->parentNode->replaceChild( $oDoc->createComment( $sId ), $oNode );
          $sId = &$aId;
       }
       else { $oNode->nodeValue = $sId; }

       $s = str_replace( $sId, $bReplaceInner?$sHtml:$sThis, $oDoc->saveHtml());
       return true;
     }
    } 
    return false; 
 }

// A function of my library used in the function above:
function suIsValidString( &$s, &$iLen = null, $minLen = null, $maxLen = null )
{
  if( !is_string( $s ) || !isset( $s{0} ))
   { return false; }

  if( $iLen !== null )
   { $iLen = strlen( $s ); }

  return (( $minLen===null?true:($minLen > 0 && isset( $s{$minLen-1} ))) && 
           $maxLen===null?true:($maxLen >= $minLen && !isset( $s{$maxLen})));   
}   

Some context functions:

 function suAppendHtmlById( &$s, $sId, $sHtml, &$oDoc = null )
 { return suSetHtmlElementById( $oDoc, $s, $sId, $sHtml, true, false ); }

 function suInsertHtmlById( &$s, $sId, $sHtml, &$oDoc = null )
 { return suSetHtmlElementById( $oDoc, $s, $sId, $sHtml, false, true ); }

 function suAddHtmlBeforeById( &$s, $sId, $sHtml, &$oDoc = null )
 { return suSetHtmlElementById( $oDoc, $s, $sId, $sHtml, false, true, true ); }

 function suAddHtmlAfterById( &$s, $sId, $sHtml, &$oDoc = null )
 { return suSetHtmlElementById( $oDoc, $s, $sId, $sHtml, true, false, true ); }

 function suSetHtmlById( &$s, $sId, $sHtml, &$oDoc = null )
 { return suSetHtmlElementById( $oDoc, $s, $sId, $sHtml, true, true ); }

 function suReplaceHtmlElementById( &$s, $sId, $sHtml, &$oDoc = null )
 { return suSetHtmlElementById( $oDoc, $s, $sId, $sHtml, false, false ); }

 function suRemoveHtmlElementById( &$s, $sId, &$oDoc = null )
 { return suSetHtmlElementById( $oDoc, $s, $sId, null, false, false ); }

How to use it:

In the following examples, I assume that there is already content loaded into a variable called $sMyHtml and the variable $sMyNewContent contains some new html. The variable $sMyHtml contains an element called/with the id 'example_id'.

// Example 1: Append new content to the innerHTML of an element (bottom of element):
if( suAppendHtmlById( $sMyHtml, 'example_id', $sMyNewContent ))
 { echo $sMyHtml; }
 else { echo 'Element not found?'; }

// Example 2: Insert new content to the innerHTML of an element (top of element):
suInsertHtmlById( $sMyHtml, 'example_id', $sMyNewContent );    

// Example 3: Add new content ABOVE element:
suAddHtmlBeforeById( $sMyHtml, 'example_id', $sMyNewContent );    

// Example 3: Add new content BELOW/NEXT TO element:
suAddHtmlAfterById( $sMyHtml, 'example_id', $sMyNewContent );    

// Example 4: SET new innerHTML content of element:
suSetHtmlById( $sMyHtml, 'example_id', $sMyNewContent );    

// Example 5: Replace entire element with new content:
suReplaceHtmlElementById( $sMyHtml, 'example_id', $sMyNewContent );    

// Example 6: Remove entire element:
suSetHtmlElementById( $sMyHtml, 'example_id' ); 


来源:https://stackoverflow.com/questions/2233683/php-domdocument-replace-domelement-child-with-html-string

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!