Trim lines and shrink whitespace using regex for a multi-line string

社会主义新天地 submitted on 2020-05-09 06:26:07

Question


I'm writing a PHP function to trim all unnecessary whitespace from a multi-line string.

The regex that isn't working is the one that removes spaces at the end of each line:

// Always trim at the end. Warning: this seems to be the costlier
// operation, perhaps because looking ahead is harder?
$patterns[] = ['/ +$/m', ''];

Given the following string from a textarea:

 first  line... abc   //<-- blank space here
 second  is  here... def   //<-- blank space here
 //<-- blank space here
 fourth  line... hi  there   //<-- blank space here

 sith  is  here....   //<-- blank space here

There are blank spaces at the beginning and end of each line plus more than one between the words.

After I run the function:

$functions->trimWhitespace($description, ['blankLines' => false]);

This is what I get:

first line... abc //<-- blank space here
second is here... def //<-- blank space here
//<-- no blank space here
fourth line... hi there //<-- blank space here

sith is here....//<-- no blank space here

Why is it only removing the trailing space from the last line?


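Answer 1:


Text submitted from a textarea typically uses \r\n line breaks, and by default $ only matches before \n (or at the very end of the string), so the spaces before \r are not at the end of a line as far as the regex is concerned. You may redefine where $ matches using the (*ANYCRLF) verb.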

See the following PHP demo:

$s = " ddd    \r\n  bbb     ";
// If the string can contain Unicode chars, also add the "u" modifier: '~(*ANYCRLF)\h+$~um'
$n = preg_replace('~(*ANYCRLF)\h+$~m', '', $s);
echo $n;

Details:

  • (*ANYCRLF) - sets the newline convention so that CR, LF, or CRLF each count as a line break
  • \h+ - 1+ horizontal whitespace chars
  • $ - end of line (now also before CR, not only before LF)
  • m - multiline mode on ($ matches at the end of every line, not just at the end of the string).

If you want to allow $ to match at any Unicode line breaks, replace (*ANYCRLF) with (*ANY).

See Newline conventions in the PCRE reference:

(*CR)        carriage return
(*LF)        linefeed
(*CRLF)      carriage return, followed by linefeed
(*ANYCRLF)   any of the three above
(*ANY)       all Unicode newline sequences
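
To make the effect of the newline convention concrete, here is a minimal sketch that runs the question's pattern with and without (*ANYCRLF) on a CRLF-separated string. It assumes PCRE uses LF as its default newline, which is the usual case for PHP builds; the sample string and variable names are illustrative only.

```php
<?php
// Two lines separated by \r\n, each with trailing spaces.
$s = "first  \r\nsecond  ";

// Default convention (assumed to be LF): $ matches only before \n or at the
// very end, so the spaces in front of \r on the first line are left alone.
$default = preg_replace('~ +$~m', '', $s);

// With (*ANYCRLF), $ also matches before \r, so both lines are trimmed.
$anycrlf = preg_replace('~(*ANYCRLF) +$~m', '', $s);

// json_encode makes the invisible \r and \n characters visible.
echo json_encode($default), "\n"; // "first  \r\nsecond"
echo json_encode($anycrlf), "\n"; // "first\r\nsecond"
```

This reproduces the symptom from the question: without the verb, only the last line loses its trailing spaces.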

Now, if you need to

  • Trim the lines from both start and end
  • Shrink whitespaces inside the lines into just a single space

use

$s = " Ł    ę  d    \r\n  Я      ёb     ";
$n = preg_replace('~(*ANYCRLF)^\h+|\h+$|(\h){2,}~um', '$1', $s);
echo $n;
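
As a rough sketch of how the combined pattern could be packaged, the hypothetical helper below (trimAndShrink is not a name from the question) wraps the call and guards against a null result. The input string is made up for illustration.

```php
<?php
// Hypothetical helper (not from the question): trims every line and
// collapses inner runs of horizontal whitespace to a single character.
function trimAndShrink(string $text): string
{
    // $1 is empty when one of the two trimming alternatives matches and
    // holds one whitespace character when the third alternative matches.
    $result = preg_replace('~(*ANYCRLF)^\h+|\h+$|(\h){2,}~um', '$1', $text);

    // preg_replace returns null on error (for example, invalid UTF-8
    // with the u modifier); fall back to the untouched input then.
    return $result ?? $text;
}

$out = trimAndShrink(" first  line... abc   \r\n second  is  here... def   ");
echo json_encode($out), "\n";
```

Because $1 holds the last whitespace character of a run, a run that ends in a tab collapses to a tab rather than a space. If a plain space is always wanted, a separate replacement step for inner runs may be simpler.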





Answer 2:


Use a two-step approach:

<?php

$text = " first  line... abc   
 second  is  here... def   
  <-- blank space here
 fourth  line... hi  there   

 sith  is  here....   ";

// get rid of spaces at the beginning and end of line
$regex = '~^\ +|\ +$~m';
$text = preg_replace($regex, '', $text);

// collapse runs of two or more consecutive spaces into a single space
$regex = '~\ {2,}~';
$text = preg_replace($regex, ' ', $text);
echo $text;

?>

See a demo on ideone.com.
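
The two-step approach relies on $ seeing every line end, which may not hold if the text arrives with \r\n line breaks, as textarea input typically does. One possible way around that, sketched below under that assumption, is to normalize line endings to \n first. The sample text is illustrative.

```php
<?php
// Text as a browser would typically submit it: lines separated by \r\n.
$text = " first  line   \r\n second  line   ";

// Normalize CRLF and lone CR to LF so that $ matches at every line end.
$text = str_replace(["\r\n", "\r"], "\n", $text);

// Step 1: trim spaces at the start and end of each line.
$text = preg_replace('~^ +| +$~m', '', $text);

// Step 2: collapse runs of two or more spaces into one.
$text = preg_replace('~ {2,}~', ' ', $text);

echo json_encode($text), "\n"; // "first line\nsecond line"
```

Note that this changes the line endings of the result to \n, which may or may not be acceptable depending on where the text goes next.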




Answer 3:


My first suggestion was to use /gm instead of just /m:

$patterns[] = ['/ +$/mg', ''];

This code won't work in PHP; the updated version below will.

Working example here: https://regex101.com/r/z3pDre/1

Update:

The g flag doesn't exist in PHP's preg_* functions. Global matching is chosen by the function instead: preg_match_all rather than preg_match, and preg_replace already replaces every match by default.

Use the regex without g, like this:

$patterns[] = ['/ +$/m', ''];
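
A short sketch of that default behavior, using the optional $limit and $count arguments of preg_replace. The input uses plain \n line breaks so that the line-ending issue does not interfere; the string is made up for illustration.

```php
<?php
$s = "a  \nb  \nc  ";

// There is no "g" flag: with the default $limit of -1, preg_replace already
// replaces every match. The fifth argument receives the replacement count.
$out = preg_replace('/ +$/m', '', $s, -1, $count);

echo json_encode($out), "\n"; // "a\nb\nc"
echo $count, "\n";            // 3

// A positive $limit caps the number of replacements instead.
$first = preg_replace('/ +$/m', '', $s, 1);
echo json_encode($first), "\n"; // "a\nb  \nc  "
```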



Answer 4:


preg_replace ( mixed $pattern , mixed $replacement , mixed $subject [, int $limit = -1 [, int &$count ]] )

so you want preg_replace('/[\s]+$/m', '', $string)
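
One thing to be aware of with \s is that it also matches line breaks, so the pattern can swallow blank lines along with the trailing spaces. The sketch below compares it with \h (horizontal whitespace only) on a made-up string containing one blank line.

```php
<?php
// "a" with trailing spaces, a blank line, then "b" with trailing spaces.
$s = "a  \n\nb  ";

// \s matches \n as well, so the first match is "  \n" and the blank line goes.
$withS = preg_replace('/\s+$/m', '', $s);

// \h matches only horizontal whitespace, so line breaks are preserved.
$withH = preg_replace('/\h+$/m', '', $s);

echo json_encode($withS), "\n"; // "a\nb"
echo json_encode($withH), "\n"; // "a\n\nb"
```

Whether removing blank lines is a bug or a feature depends on the use case. In the question, blankLines is set to false, so preserving them may be the intended behavior.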




Answer 5:


 $content = preg_replace('/^(.*?) +$/m', '$1', $content);




Source: https://stackoverflow.com/questions/42045571/trim-lines-and-shrink-whitespaces-using-regex-for-multi-line-string
