How to remove entire div with preg_replace


OK, as this is a WordPress problem and it sadly goes a little deeper: I need to remove every occurrence of the parent div and everything inside it.



        
                      
4 Answers

Answered by 野趣味 on 2021-01-16 17:27

I wouldn't use a regular expression. Instead, I would use the DOMDocument class. Just find all of the div elements with that class, and remove them from their parent(s):

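$html = "<p>Hello World</p>
         <div class='sometestclass'>
           <img src='foo.png'/>
           <div>Bar</div>
         </div>";

// Parse the fragment; libxml wraps it in <html><body>...</body></html>.
$dom = new DOMDocument;
$dom->loadHTML( $html );

// Select every div whose class attribute is exactly 'sometestclass'.
$xpath = new DOMXPath( $dom );
$pDivs = $xpath->query(".//div[@class='sometestclass']");

// Detach each matching div, together with its children, from its parent.
foreach ( $pDivs as $div ) {
  $div->parentNode->removeChild( $div );
}

// Print only what is left inside <body>.
echo preg_replace( "/.*<body>(.*)<\/body>.*/s", "$1", $dom->saveHTML() );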


Which results in:

<p>Hello World</p>
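
Note that the XPath query above compares the whole class attribute, so a div written as class="box sometestclass" would not be matched. The following is a minimal sketch of a variant that matches the class as one token among several and returns the children of <body> directly instead of trimming the document with a regular expression. The function name remove_divs_by_class and the sample markup are made up for illustration, and the sketch assumes the class name contains no quote characters.

```php
<?php
// Hypothetical helper, not from the answer above: remove every <div> whose
// class list contains $class and return the remaining markup as a string.
function remove_divs_by_class(string $html, string $class): string
{
    if (trim($html) === '') {
        return $html; // nothing to parse
    }

    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // keep parser warnings quiet
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Match the class as a whitespace-separated token.
    // Assumption: $class is a plain class name without quotes.
    $query = sprintf(
        "//div[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]",
        $class
    );

    $xpath = new DOMXPath($dom);
    foreach ($xpath->query($query) as $div) {
        $div->parentNode->removeChild($div);
    }

    // Serialize only the children of <body>, skipping the wrapper libxml added.
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

// Usage: the div and everything inside it is dropped, the paragraph stays.
echo remove_divs_by_class(
    '<p>Hello World</p><div class="box sometestclass"><img src="foo.png"><div>Bar</div></div>',
    'sometestclass'
);
```

The result of an XPath query is not a live list, so removing nodes inside the foreach loop does not disturb the iteration.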

                                                                        
                                                        
            
            
              
                
Answered by 死守一世寂寞 on 2021-01-16 17:28

How about just using some CSS? .sometestclass { display: none; }
                                                                        
                                                        
            
            
              
                
Answered by 温柔的废话 on 2021-01-16 17:28

For the UTF-8 issue, I found a hack in the PHP manual.

So my function looks as follows:

function rem_fi_cat() {
    /* This function removes images from _within_ the article,
     * if these images are enclosed in a "wp-caption" div tag
     * and the article's post format is "image".
     * Only on the home page, the front page and in category/archive pages.
     */
    if ( ( is_home() || is_front_page() || is_category() ) && has_post_format( 'image' ) ) {
        $document = new DOMDocument();
        $content  = get_the_content( '', true );
        if ( '' != $content ) {
            /* incl. UTF-8 "hack" as described at
             * http://www.php.net/manual/en/domdocument.loadhtml.php#95251
             */
            $document->loadHTML( '<?xml encoding="UTF-8">' . $content );
            foreach ( $document->childNodes as $item ) {
                if ( $item->nodeType == XML_PI_NODE ) {
                    $document->removeChild( $item ); // remove hack
                    $document->encoding = 'UTF-8';   // insert proper
                }
            }

            // Remove every div whose class attribute is exactly "wp-caption".
            $xpath = new DOMXPath( $document );
            $pDivs = $xpath->query( ".//div[@class='wp-caption']" );

            foreach ( $pDivs as $div ) {
                $div->parentNode->removeChild( $div );
            }

            // Print only the inside of the "entry-container" div.
            echo preg_replace( "/.*<div class=\"entry-container\">(.*)<\/div>.*/s", "$1", $document->saveHTML() );
        }
    }
}
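
Because the function above echoes its result and calls WordPress template tags, it is hard to try outside a theme. The following is a rough sketch of the same idea as a plain string-in, string-out function, assuming the content is a UTF-8 HTML fragment with balanced tags. The name strip_caption_divs and the fragment-root wrapper id are hypothetical. The sketch also matches wp-caption as one class among several, on the assumption that the div may carry additional classes.

```php
<?php
// Hypothetical helper (not part of WordPress): strip wp-caption divs from a
// UTF-8 HTML fragment and return the remaining markup.
function strip_caption_divs(string $content): string
{
    if (trim($content) === '') {
        return $content;
    }

    $document = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // keep parser warnings quiet
    // The XML prefix is the UTF-8 hint from the PHP manual comment; the
    // wrapper div (made-up id) lets us find the fragment again afterwards.
    $document->loadHTML(
        '<?xml encoding="UTF-8"><div id="fragment-root">' . $content . '</div>'
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($document);
    $captions = $xpath->query(
        "//div[contains(concat(' ', normalize-space(@class), ' '), ' wp-caption ')]"
    );
    foreach ($captions as $div) {
        $div->parentNode->removeChild($div);
    }

    $root = $xpath->query("//div[@id='fragment-root']")->item(0);
    if ($root === null) {
        return $content; // wrapper not found, leave the input untouched
    }

    // Serialize the wrapper's children only, so no regex clean-up is needed.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $document->saveHTML($child);
    }
    return $out;
}

echo strip_caption_divs(
    '<p>Größe</p><div class="wp-caption alignleft"><img src="a.png"></div><p>Text</p>'
);
```

Inside WordPress, a function of this shape could presumably be attached with add_filter( 'the_content', 'strip_caption_divs' ), with the is_home() / has_post_format() checks from the original function added at the top of the callback.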
                                                                        
                                                        
            
            
              
                
Answered by 囚心锁ツ on 2021-01-16 17:50

<?php $content = preg_replace('/<div class="sometestclass">.*?<\/div><!-- END: .sometestclass -->/s','',$content); ?>


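My regex is a bit rusty, but I think this should work, provided the closing tag of the div is immediately followed by an <!-- END: .sometestclass --> comment, which the pattern uses as its end marker. Do note that, as others have said, regular expressions are not properly equipped to handle some of the complexities of HTML.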

In addition, this pattern won't find embedded div elements with the class sometestclass. You would need recursion for that.
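
To illustrate what recursion could look like here, the sketch below uses a recursive PCRE subpattern to match balanced div tags, so it needs no END comment and also copes with nested divs. The function name strip_sometestclass_divs is hypothetical, and the sketch assumes well-formed markup and exactly the opening tag <div class="sometestclass">. It is still fragile next to a DOM parser, for example when a comment or attribute value contains the text "<div".

```php
<?php
// Hypothetical helper: remove <div class="sometestclass"> blocks, including
// any divs nested inside them, using a recursive PCRE subpattern.
function strip_sometestclass_divs(string $html): string
{
    // "div" matches one balanced <div>...</div>, calling itself for nested divs.
    $anyDiv = '(?<div><div\b[^>]*>(?:(?!</?div\b).|(?&div))*+</div>)';

    $pattern = '~(?(DEFINE)' . $anyDiv . ')'
             . '<div class="sometestclass">'
             . '(?:(?!</?div\b).|(?&div))*+' // plain content or a nested div
             . '</div>~s';

    $result = preg_replace($pattern, '', $html);

    // preg_replace() returns null on error; fall back to the input then.
    return $result === null ? $html : $result;
}

// Prints: <p>Hello World</p><p>After</p>
echo strip_sometestclass_divs(
    '<p>Hello World</p><div class="sometestclass"><img src="foo.png"/><div>Bar</div></div><p>After</p>'
);
```

If a div is never closed, the pattern simply does not match and the input is returned unchanged.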
                                                                        
                                                        
            
            
              
                