removing new lines except in <pre>

温柔的废话 2021-01-14 19:24

I want to remove new lines from some HTML (with PHP), except inside <pre> tags, where whitespace is obviously important.

3 answers
  • Split the content up. This is easily done with...

    $blocks = preg_split('/<(|\/)pre>/', $html);
    

    Just be careful, because the $blocks elements won't contain the opening and closing pre tags. I think it is acceptable to assume the HTML is valid, so you can expect the pre blocks to be every other element in the array (1, 3, 5, ...). This is easily tested with $i % 2 == 1.
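
    As a quick illustration of that alternating layout, here is a minimal sketch on a made-up three-part string (the sample HTML is hypothetical, not taken from the question):

    ```php
    <?php
    // Hypothetical sample input: some markup, one <pre> block, more markup.
    $html = "<p>a</p><pre>x\ny</pre><p>b</p>";

    // Same pattern as above. Without PREG_SPLIT_DELIM_CAPTURE the matched
    // tags themselves are dropped from the result.
    $blocks = preg_split('/<(|\/)pre>/', $html);

    foreach ($blocks as $i => $block) {
        // Odd indexes hold what used to be inside <pre>...</pre>.
        $kind = ($i % 2 == 1) ? 'pre' : 'other';
        echo $i, ' (', $kind, '): ', var_export($block, true), "\n";
    }
    ```

    Note that this pattern only matches a bare, lowercase <pre> or </pre>. An opening tag with attributes, such as <pre class="x">, would not be split on.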

    Example "complete" script (modify as you need to)...

    <?php
    //our example HTML file - could just as easily be read in from a file
    $html = <<<EOF
    <html>
      <head>
        <title>test</title>
      </head>
      <body>
        <h1>Title</h1>
        <p>
          This is an article about...
        </p>
        <pre>
          line one
          line two
          line three
        </pre>
        <div style="float: right;">
          random
        </div>
      </body>
    </html>
    EOF;
    
    //break it all apart...
    $blocks = preg_split('/<(|\/)pre>/', $html);
    
    //and put it all back together again
    $html = ""; //reuse as our buffer
    foreach($blocks as $i => $block)
    {
      if($i % 2 == 1)
        $html .= "\n<pre>$block</pre>\n"; //break out <pre>...</pre> with \n's
      else 
        $html .= str_replace(array("\n", "\r"), "", $block); //strip line breaks outside <pre>
    }
    
    echo $html;
    ?>
    
  • 2021-01-14 20:13

    It may be 3 years later, but... The following code strips line breaks and tabs next to tags and collapses runs of whitespace, as long as they are outside of pre tags. Cheers!

    function sanitize_output($buffer)
    {
        $search = array(
            '/\>[^\S ]+/s', //strip whitespaces after tags, except space
            '/[^\S ]+\</s', //strip whitespaces before tags, except space
            '/(\s)+/s'  // shorten multiple whitespace sequences
            );
        $replace = array(
            '>',
            '<',
            '\\1'
            );
    
        // -1 means "no limit"; passing null for an int parameter is deprecated as of PHP 8.1
        $blocks = preg_split('/(<\/?pre[^>]*>)/', $buffer, -1, PREG_SPLIT_DELIM_CAPTURE);
        $buffer = '';
        foreach($blocks as $i => $block)
        {
          if($i % 4 == 2)
            $buffer .= $block; // text inside <pre>...</pre>: leave untouched
          else 
            $buffer .= preg_replace($search, $replace, $block);
        }
    
        return $buffer;
    }
    
    ob_start("sanitize_output");
    
  • 2021-01-14 20:17

    If the HTML is well formed, you can rely on the fact that <pre> tags aren't allowed to be nested. Make two passes: first, split the input into pre blocks and everything else (a regular expression works for this task). Then strip new lines from each non-pre block, and finally join them all back together.

    Note that most html isn't well formed, so this approach may have some limits to where you can use it.
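
    The split, strip, and join steps described above could be sketched roughly as follows. This is only one possible reading, under the assumption that pre blocks are never nested and always closed; the function name strip_newlines_outside_pre and the sample string are made up for illustration.

    ```php
    <?php
    // Hypothetical helper name, not part of any library.
    function strip_newlines_outside_pre(string $html): string
    {
        // Split so that each complete <pre ...>...</pre> block is kept as its
        // own chunk. With PREG_SPLIT_DELIM_CAPTURE, even indexes hold ordinary
        // markup and odd indexes hold whole pre blocks, tags included.
        // Assumption: pre blocks are not nested and are always closed.
        $chunks = preg_split(
            '/(<pre\b[^>]*>.*?<\/pre>)/is',
            $html,
            -1,
            PREG_SPLIT_DELIM_CAPTURE
        );

        // Strip line breaks from the non-pre chunks only.
        foreach ($chunks as $i => $chunk) {
            if ($i % 2 === 0) {
                $chunks[$i] = str_replace(["\r", "\n"], '', $chunk);
            }
        }

        // Join everything back together in the original order.
        return implode('', $chunks);
    }

    // Made-up sample input for a quick manual check.
    $sample = "<p>\nHello\n</p>\n<pre class=\"code\">\nline one\nline two\n</pre>\n<p>Bye</p>\n";
    echo strip_newlines_outside_pre($sample);
    ```

    Capturing the whole pre block, rather than only its tags, keeps the index test at a simple even/odd check and tolerates attributes and uppercase tag names. It shares the limitation noted above: an unclosed <pre> is simply not matched, so its contents would have their line breaks stripped like any other markup.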
