PHP DOMDocument replace DOMElement child with HTML string

[愿得一人] 2020-12-31 13:09

Using PHP I'm attempting to take an HTML string passed from a WYSIWYG editor and replace the children of an element inside of a preloaded HTML document with the new HTML.

5 answers
  • 2020-12-31 13:51

    I know this is old, but none of the current answers shows a minimal working example of how to replace DOMNode(s) in a DOMDocument with HTML stored in a string.

    // the HTML fragment we want to use as the replacement
    $htmlReplace = '<div><strong>foo</strong></div>';
    // the HTML of the original document
    $htmlHaystack = '<p><a id="tag">bar</a></p>';
    
    // load the HTML replacement fragment
    $domDocumentReplace = new \DOMDocument;
    $domDocumentReplace->loadHTML($htmlReplace, LIBXML_HTML_NOIMPLIED);
    
    // load the HTML of the document
    $domDocumentHaystack = new \DOMDocument;
    $domDocumentHaystack->loadHTML($htmlHaystack, LIBXML_HTML_NOIMPLIED);
    
    // import the replacement node into the document
    $htmlReplaceNode = $domDocumentHaystack->importNode($domDocumentReplace->documentElement, true);
    
    // find the DOMNode(s) we want to replace - in this case #tag (to keep the example simple)
    $domNodeTag = $domDocumentHaystack->getElementById('tag');
    
    // replace the node
    $domNodeTag->parentNode->replaceChild($htmlReplaceNode, $domNodeTag);
    
    // output the new HTML of the document
    echo $domDocumentHaystack->saveHTML($domDocumentHaystack->documentElement);
    // <p><div><strong>foo</strong></div></p>
    
  • 2020-12-31 13:58

    The currently accepted answer suggests using appendXML(), but acknowledges that it won't handle complex HTML such as what a WYSIWYG editor returns, as specified in the original question. As suggested, loadHTML() can address this, but no one has yet shown how.

    This is what I believe is the best/correct answer to the original question addressing encoding issues, "Document Fragment is empty" warnings and "Wrong Document Error" errors that someone is likely to hit if they write this from scratch. I know I found them after following the hints in the previous responses.

    This is code from a site I support that inserts WordPress sidebar content into the $content of a post. It assumes that $doc is a valid DOMDocument, defined similarly to $doc in the original question. It also assumes that $element is the tag after which you wish to insert the sidebar content (or whatever).

    // NOTE: Cannot use a document fragment here, as the AMP HTML is too complex for appendXML() to accept.
    // Instead, load it into its own document and import its root element.
    $node = new DOMDocument();
    // Encode non-ASCII characters as entities, or strange characters may appear.
    // (The 'HTML-ENTITIES' target encoding is deprecated as of PHP 8.2.)
    $node->loadHTML( mb_convert_encoding( $sidebarContent, 'HTML-ENTITIES', 'UTF-8' ) );
    // Move the root element into the scope of the content document created above,
    // or the insert/append will be rejected with a "Wrong Document Error".
    // Without LIBXML_HTML_NOIMPLIED, documentElement is the implied <html> wrapper.
    $node = $doc->importNode( $node->documentElement, true );
    // If there is a next sibling, insert before it.
    // If not, just add it at the end of the parent of the element we did find.
    if ( $element->nextSibling ) {
        $element->parentNode->insertBefore( $node, $element->nextSibling );
    } else {
        $element->parentNode->appendChild( $node );
    }
    

    After all of this is done, if you don't want the source of a full HTML document with body tags and whatnot, you can generate the more localized HTML with this:

        // Now because we have moved the post content into a full document, we need to get rid of the 
        // extra elements that make it a document and not a fragment
        $body = $doc->getElementsByTagName( 'body' );
        $body = $body->item(0);
    
        // If you need an element with a body tag, you can do this.
        // return $doc->saveHTML( $body );
    
        // Extract the html from the body tag piece by piece to ensure valid html syntax in destination document
        $bodyContent = ''; 
        foreach( $body->childNodes as $node ) { 
                $bodyContent .= $body->ownerDocument->saveHTML( $node ); 
        } 
        // Now return the full content with the new content added. 
        return $bodyContent;
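    
    As a self-contained sketch of the steps above (not the site's actual code), the following runs the whole flow on hypothetical sample strings. It swaps the mb_convert_encoding() call for mb_encode_numericentity(), which is one way to get a similar effect without the 'HTML-ENTITIES' encoding that PHP 8.2 deprecates. The helper name toNumericEntities(), the sample markup, and the assumption that the sidebar has a single root element are all made up for illustration.

    ```php
    // Hypothetical inputs, for illustration only.
    $content        = '<p id="intro">Intro</p><p>Outro</p>';
    $sidebarContent = '<div class="sidebar">Café crème</div>';

    // Turn every non-ASCII character into a numeric entity so that libxml
    // does not misread the UTF-8 bytes (requires the mbstring extension).
    function toNumericEntities(string $html): string
    {
        return mb_encode_numericentity($html, [0x80, 0x10FFFF, 0, 0x1FFFFF], 'UTF-8');
    }

    // The main document: loadHTML() adds the implied <html> and <body> wrappers.
    $doc = new DOMDocument();
    $doc->loadHTML(toNumericEntities($content));
    $element = $doc->getElementById('intro');

    // The sidebar: assumed to have one root element, so the wrappers are suppressed.
    $sidebarDoc = new DOMDocument();
    $sidebarDoc->loadHTML(
        toNumericEntities($sidebarContent),
        LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD
    );

    // Import into the main document, then insert right after $element.
    // insertBefore() with a null reference node behaves like appendChild().
    $node = $doc->importNode($sidebarDoc->documentElement, true);
    $element->parentNode->insertBefore($node, $element->nextSibling);

    // Serialize only the children of <body>, dropping the document wrappers.
    $body = $doc->getElementsByTagName('body')->item(0);
    $bodyContent = '';
    foreach ($body->childNodes as $child) {
        $bodyContent .= $doc->saveHTML($child);
    }
    echo $bodyContent;
    ```

    Because insertBefore() accepts a null reference node, the if/else on nextSibling is not strictly needed in this variant.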
    
  • 2020-12-31 13:58

    I know this is an old thread, but I am replying because I was also looking for a solution to this. I have written a simple function that replaces content with a single line at the call site. To make it easier to use, I have also added some wrapper functions with descriptive names.

    This is now part of my library, which explains the function names: every function starts with the prefix 'su'.

    It is very easy to use and very powerful, and it needs little code.

    Here is the code:

    function suSetHtmlElementById( &$oDoc, &$s, $sId, $sHtml, $bAppend = false, $bInsert = false, $bAddToOuter = false )
     {
        if( suIsValidString( $s ) && suIsValidString( $sId ))
        {
         $bCreate = true;
         if( is_object( $oDoc ))
         {
           if( !( $oDoc instanceof DOMDocument ))
            { return false; }
           $bCreate = false;
         }
    
         if( $bCreate )
          { $oDoc = new DOMDocument(); }
    
         libxml_use_internal_errors(true);
         $oDoc->loadHTML($s);
         libxml_use_internal_errors(false);
         $oNode = $oDoc->getElementById( $sId );
    
         if( is_object( $oNode ))
         { 
           $bReplaceOuter = ( !$bAppend && !$bInsert );
    
           $sId = uniqid('SHEBI-');
           $aId = array( "<!-- $sId -->", "<!--$sId-->" );
    
           if( $bReplaceOuter )
           {
             if( suIsValidString( $sHtml ) )
             {
                 $oNode->parentNode->replaceChild( $oDoc->createComment( $sId ), $oNode );
                 $s = str_replace( $aId, $sHtml, $oDoc->saveHtml());
             }
             else { $oNode->parentNode->removeChild( $oNode ); 
                    $s = $oDoc->saveHtml();
                  }
             return true;
           }
    
           $bReplaceInner = ( $bAppend && $bInsert );
           $sThis = null;
    
           if( !$bReplaceInner )
           {
             $sThis = $oDoc->saveHTML( $oNode );
             $sThis = ($bInsert?$sHtml:'').($bAddToOuter?$sThis:(substr($sThis,strpos($sThis,'>')+1,-(strlen($oNode->nodeName)+3)))).($bAppend?$sHtml:''); 
           }
    
           if( !$bReplaceInner && $bAddToOuter )
           { 
              $oNode->parentNode->replaceChild( $oDoc->createComment( $sId ), $oNode );
              $sId = &$aId;
           }
           else { $oNode->nodeValue = $sId; }
    
           $s = str_replace( $sId, $bReplaceInner?$sHtml:$sThis, $oDoc->saveHtml());
           return true;
         }
        } 
        return false; 
     }
    
    // A function of my library used in the function above:
    function suIsValidString( &$s, &$iLen = null, $minLen = null, $maxLen = null )
    {
      if( !is_string( $s ) || !isset( $s{0} ))
       { return false; }
    
      if( $iLen !== null )
       { $iLen = strlen( $s ); }
    
      return (( $minLen===null?true:($minLen > 0 && isset( $s{$minLen-1} ))) && 
               $maxLen===null?true:($maxLen >= $minLen && !isset( $s{$maxLen})));   
    }   
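    
    The core idea of the function above can be sketched in a few lines: swap the target node for a uniquely named comment, serialize the document, then substitute the raw HTML string for that comment. The sample markup and variable names below are hypothetical, and this is a simplified illustration of the technique rather than a replacement for the function.

    ```php
    // Hypothetical sample data, for illustration only.
    $html        = '<div id="wrap"><p id="old">old</p></div>';
    $replacement = '<section><em>new</em> content</section>';

    $doc = new DOMDocument();
    $doc->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    // Replace the target element with a comment that carries a unique marker.
    $target = $doc->getElementById('old');
    $marker = uniqid('MARK-');
    $target->parentNode->replaceChild($doc->createComment($marker), $target);

    // The comment is serialized as <!--MARK-...-->, so a plain string
    // replacement can drop the new HTML in at exactly that position.
    $result = str_replace('<!--' . $marker . '-->', $replacement, $doc->saveHTML());
    echo $result;
    ```

    Since the replacement is spliced in after serialization, DOMDocument never parses it. That avoids the import step entirely, but it also means the new HTML is not validated or normalized.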
    

    Some context functions:

     function suAppendHtmlById( &$s, $sId, $sHtml, &$oDoc = null )
     { return suSetHtmlElementById( $oDoc, $s, $sId, $sHtml, true, false ); }
    
     function suInsertHtmlById( &$s, $sId, $sHtml, &$oDoc = null )
     { return suSetHtmlElementById( $oDoc, $s, $sId, $sHtml, false, true ); }
    
     function suAddHtmlBeforeById( &$s, $sId, $sHtml, &$oDoc = null )
     { return suSetHtmlElementById( $oDoc, $s, $sId, $sHtml, false, true, true ); }
    
     function suAddHtmlAfterById( &$s, $sId, $sHtml, &$oDoc = null )
     { return suSetHtmlElementById( $oDoc, $s, $sId, $sHtml, true, false, true ); }
    
     function suSetHtmlById( &$s, $sId, $sHtml, &$oDoc = null )
     { return suSetHtmlElementById( $oDoc, $s, $sId, $sHtml, true, true ); }
    
     function suReplaceHtmlElementById( &$s, $sId, $sHtml, &$oDoc = null )
     { return suSetHtmlElementById( $oDoc, $s, $sId, $sHtml, false, false ); }
    
     function suRemoveHtmlElementById( &$s, $sId, &$oDoc = null )
     { return suSetHtmlElementById( $oDoc, $s, $sId, null, false, false ); }
    

    How to use it:

    In the following examples, I assume that content is already loaded into a variable called $sMyHtml and that the variable $sMyNewContent contains some new HTML. The HTML in $sMyHtml contains an element with the id 'example_id'.

    // Example 1: Append new content to the innerHTML of an element (bottom of element):
    if( suAppendHtmlById( $sMyHtml, 'example_id', $sMyNewContent ))
     { echo $sMyHtml; }
     else { echo 'Element not found?'; }
    
    // Example 2: Insert new content to the innerHTML of an element (top of element):
    suInsertHtmlById( $sMyHtml, 'example_id', $sMyNewContent );    
    
    // Example 3: Add new content ABOVE element:
    suAddHtmlBeforeById( $sMyHtml, 'example_id', $sMyNewContent );    
    
    // Example 4: Add new content BELOW/NEXT TO element:
    suAddHtmlAfterById( $sMyHtml, 'example_id', $sMyNewContent );
    
    // Example 5: SET new innerHTML content of element:
    suSetHtmlById( $sMyHtml, 'example_id', $sMyNewContent );
    
    // Example 6: Replace entire element with new content:
    suReplaceHtmlElementById( $sMyHtml, 'example_id', $sMyNewContent );
    
    // Example 7: Remove entire element:
    suRemoveHtmlElementById( $sMyHtml, 'example_id' );
    
  • 2020-12-31 14:02

    If the HTML string can be parsed as XML, you can do this (after clearing the element of all child nodes):

    $fragment = $doc->createDocumentFragment();
    $fragment->appendXML($html_string);
    $element->appendChild($fragment);
    

    If $html_string cannot be parsed as XML, appendXML() will fail. In that case you’ll have to use loadHTML(), which is less strict, but it adds elements around the fragment which you will have to strip.
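
    A minimal sketch of the appendXML() route, including the "clear the element first" step and a check of the return value, might look like this. The function name replaceChildrenWithXml() and the sample markup are hypothetical. The fragment is parsed before the old children are removed, so the element is left untouched when parsing fails.

    ```php
    // Replace all children of $element with the nodes parsed from $xml.
    // Returns false (and changes nothing) if $xml is not well-formed XML.
    function replaceChildrenWithXml(DOMElement $element, string $xml): bool
    {
        $fragment = $element->ownerDocument->createDocumentFragment();

        // Collect libxml parse errors internally instead of emitting warnings.
        $previous = libxml_use_internal_errors(true);
        $ok = $fragment->appendXML($xml);
        libxml_clear_errors();
        libxml_use_internal_errors($previous);

        if (!$ok) {
            return false;
        }

        // Clear the element of all child nodes, then attach the fragment.
        while ($element->firstChild !== null) {
            $element->removeChild($element->firstChild);
        }
        $element->appendChild($fragment);
        return true;
    }

    // Usage with hypothetical markup:
    $doc = new DOMDocument();
    $doc->loadHTML(
        '<div id="target"><p>old</p></div>',
        LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD
    );
    $element = $doc->getElementById('target');

    $accepted = replaceChildrenWithXml($element, '<strong>new</strong> text'); // well-formed
    $rejected = replaceChildrenWithXml($element, '<br>');                      // unclosed tag

    echo $doc->saveHTML($element);
    ```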

    Unlike PHP, JavaScript has the innerHTML property, which allows you to do this very easily. I needed something like it for a project, so I extended PHP’s DOMElement to include JavaScript-like innerHTML access.

    With it you can access the innerHTML property and change it just as you would in JavaScript:

    echo $element->innerHTML;
    $element->innerHTML = '<a href="http://example.org">example</a>';
    

    Source: http://www.keyvan.net/2012/11/php-domdocument-replace-domelement-child-with-html-string/
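
    The code from the linked source is not reproduced here. As a rough sketch of how such an extension could be built, assuming a hypothetical class name InnerHtmlElement, one can subclass DOMElement, handle innerHTML in the magic __get()/__set() methods, and register the subclass with registerNodeClass(). Encoding handling is left out to keep the example short.

    ```php
    // Hypothetical DOMElement subclass exposing a JavaScript-like innerHTML property.
    class InnerHtmlElement extends DOMElement
    {
        public function __get($name)
        {
            if ($name !== 'innerHTML') {
                return null;
            }
            // Serialize each child node and concatenate the results.
            $html = '';
            foreach ($this->childNodes as $child) {
                $html .= $this->ownerDocument->saveHTML($child);
            }
            return $html;
        }

        public function __set($name, $value)
        {
            if ($name !== 'innerHTML') {
                return;
            }
            // Parse the new HTML inside a wrapper <div> so that several
            // top-level nodes are allowed.
            $tmp = new DOMDocument();
            $previous = libxml_use_internal_errors(true);
            $tmp->loadHTML('<div>' . $value . '</div>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
            libxml_clear_errors();
            libxml_use_internal_errors($previous);

            // Remove the old children, then import the parsed nodes.
            while ($this->firstChild !== null) {
                $this->removeChild($this->firstChild);
            }
            foreach ($tmp->documentElement->childNodes as $child) {
                $this->appendChild($this->ownerDocument->importNode($child, true));
            }
        }
    }

    // Register the subclass before loading, so elements use it.
    $doc = new DOMDocument();
    $doc->registerNodeClass('DOMElement', 'InnerHtmlElement');
    $doc->loadHTML('<div id="box"><p>old</p></div>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    $element = $doc->getElementById('box');
    $element->innerHTML = '<a href="http://example.org">example</a>';
    echo $element->innerHTML;
    ```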

  • 2020-12-31 14:03

    You can use loadHTML() on a fragment of code and then append the resulting created nodes into the original DOM tree.
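
    As a minimal sketch of that approach (the function name appendHtml() and the sample markup are hypothetical), the fragment can be loaded into a temporary document, after which each child of the implied body element is imported and appended to the target:

    ```php
    // Parse $html in a temporary document and append the resulting
    // nodes to $target, which lives in a different document.
    function appendHtml(DOMElement $target, string $html): void
    {
        if (trim($html) === '') {
            return; // loadHTML() rejects empty input
        }

        $tmp = new DOMDocument();
        $previous = libxml_use_internal_errors(true);
        $tmp->loadHTML($html); // wraps the fragment in <html><body>
        libxml_clear_errors();
        libxml_use_internal_errors($previous);

        $body = $tmp->getElementsByTagName('body')->item(0);
        if ($body === null) {
            return;
        }

        // importNode() copies each node into the target's document;
        // appending a node from another document directly would fail.
        foreach ($body->childNodes as $child) {
            $target->appendChild($target->ownerDocument->importNode($child, true));
        }
    }

    // Usage with hypothetical markup:
    $doc = new DOMDocument();
    $doc->loadHTML(
        '<div id="target"><p>kept</p></div>',
        LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD
    );
    $target = $doc->getElementById('target');

    appendHtml($target, '<p>one</p><p>two</p>');
    echo $doc->saveHTML($target);
    ```

    To replace the children rather than append to them, remove the existing child nodes of $target before calling the function.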
