Remove whitespace from HTML

I have HTML code like:

15 Answers

深忆病人, 2020-12-28 14:46
              
            
            
                                                                       
Thank you for posting this question. The real problem here is working around whitespace bugs in certain environments. The regex solution works in the general case, but as a quick hack you can remove the leading whitespace and add <?php ?> tags to the end of each line: PHP discards a single newline that immediately follows a closing ?>. E.g.:

<ul><?php ?>
<li><a id="nav-questions" href="/questions">Questions</a></li><?php ?>
<li><a id="nav-tags" href="/tags">Tags</a></li><?php ?>
<li><a id="nav-users" href="/users">Users</a></li><?php ?>
<li><a id="nav-badges" href="/badges">Badges</a></li><?php ?>
<li><a id="nav-unanswered" href="/unanswered">Unanswered</a></li><?php ?>
</ul>


Obviously this is sub-optimal for a variety of reasons, but it'll work for a localized problem without affecting the entire tool chain.
                                                                        
                                                        
            
            
              
                
南旧, 2020-12-28 14:47
              
            
            
                                                                       
Use a regular expression that matches only the whitespace between a closing > and the next <, and replace each match with ><, like:

>\s+<
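
A minimal sketch of how such a regex could be applied in PHP, assuming the goal is to delete whitespace-only gaps between tags. The helper name strip_inter_tag_whitespace and the sample markup are hypothetical, and the pattern used is />\s+</:

```php
<?php
// Hypothetical helper name; not part of the original answer.
function strip_inter_tag_whitespace(string $html): string
{
    // Match ">" + a whitespace-only run + "<" and close the gap.
    // Text inside elements is untouched because it contains non-whitespace characters.
    // preg_replace() returns null on a regex error, so fall back to the input.
    return preg_replace('/>\s+</', '><', $html) ?? $html;
}

$html = "<ul>\n    <li><a href=\"/questions\">Questions</a></li>\n    <li><a href=\"/tags\">Tags</a></li>\n</ul>";
echo strip_inter_tag_whitespace($html), "\n";
```

Note that this also removes whitespace between inline elements and inside <pre> or <textarea> blocks, where it can be significant, so it is best suited to markup such as navigation lists.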

                                                                        
                                                        
            
            
              
                
温柔的废话, 2020-12-28 14:48
              
            
            
                                                                       
If you have 8-bit (extended ASCII) text, this replaces the control characters with spaces and keeps the characters in the range 128-255:

$text = preg_replace('/[\x00-\x1F\x7F]/', ' ', $text);


If you have a UTF-8 encoded string, this will do the job:

$text = preg_replace('/[\x00-\x1F\x7F]/u', '', $text);
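
As a rough illustration of the UTF-8 variant, the sketch below strips ASCII control characters while leaving a multi-byte character intact. The function name strip_control_chars and the sample string are assumptions made for this example:

```php
<?php
// Hypothetical helper name; not part of the original answer.
function strip_control_chars(string $text): string
{
    // With the /u modifier, pattern and subject are treated as UTF-8,
    // so only the ASCII control characters (0x00-0x1F and 0x7F) are removed.
    $clean = preg_replace('/[\x00-\x1F\x7F]/u', '', $text);

    // preg_replace() returns null when the subject is not valid UTF-8 under /u;
    // in that case this sketch simply returns the input unchanged.
    return $clean ?? $text;
}

// "Café" contains a two-byte UTF-8 character; tab, CR, LF and DEL are control characters.
echo strip_control_chars("Caf\u{00E9}\tau\r\nlait\x7F"), "\n";
```

Because tabs and newlines are control characters too, this removes line breaks entirely rather than collapsing them, which may or may not be what you want for HTML.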


                                                                        
                                                        
            
            
              
                
清歌不尽, 2020-12-28 14:48
              
            
            
                                                                       
<?php
    // The constant name must be passed to define() as a quoted string.
    define('COMPRESSOR', 1);

    function remove_html_comments($content = '') {
        // The s modifier lets . match newlines, so multi-line comments are removed too.
        return preg_replace('/<!--.*?-->/s', '', $content);
    }

    function sanitize_output($buffer) {
        $search = array(
            '/\>[^\S ]+/s',  // strip whitespace after tags, except spaces
            '/[^\S ]+\</s',  // strip whitespace before tags, except spaces
            '/(\s)+/s'       // shorten multiple whitespace sequences
        );

        $replace = array(
            '>',
            '<',
            '\\1'
        );

        $buffer = preg_replace($search, $replace, $buffer);
        return remove_html_comments($buffer);
    }

    // Route all page output through sanitize_output() when compression is enabled.
    if (COMPRESSOR) { ob_start("sanitize_output"); }
?>

    <html>
        <head>
          <!-- comment -->
          <title>Example   1</title>
        </head>
        <body>
           <p>This is       example</p>
        </body>
    </html>


    RESULT (roughly; with space-indented input like this, a single space can remain between tags): <html><head><title>Example 1</title></head><body><p>This is example</p></body></html>

                                                                        
                                                        
            
            
              
                
天命终不由人, 2020-12-28 14:52
              
            
            
                                                                       
I can't delete this answer, but it's no longer relevant: the web landscape has changed so much in 8 years that it has become useless.
                                                                        
                                                        
            
            
              
                
花落未央, 2020-12-28 14:52
              
            
            
                                                                       
A regex replace could do the trick, something like:

$result = preg_replace('!\s+!', ' ', $content);  // collapse each whitespace run into one space
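
Collapsing whitespace this way still leaves one space between adjacent tags. A possible follow-up step, sketched here under the assumption that those remaining single spaces are not wanted, removes them with a plain string replace. The helper name collapse_html_whitespace is hypothetical:

```php
<?php
// Hypothetical helper name; not part of the original answer.
function collapse_html_whitespace(string $content): string
{
    // Step 1: collapse every run of whitespace (spaces, tabs, newlines) into one space.
    $collapsed = preg_replace('!\s+!', ' ', $content) ?? $content;

    // Step 2: drop the single space left between a closing ">" and the next "<",
    // then trim whitespace at both ends of the document.
    return trim(str_replace('> <', '><', $collapsed));
}

$content = "<p>This is       example</p>\n\n   <p>Second   one</p>\n";
echo collapse_html_whitespace($content), "\n";
```

Spaces inside text are kept as single spaces, so "This is       example" becomes "This is example". As with any regex-based approach, content inside <pre> or <textarea> is affected as well.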

                                                                        
                                                        
            
            
              
                