The string input comes from a textarea where users are supposed to enter each item on a new line.
When processing the form, it is easy to explode the textarea
You can normalize the input:
<?php
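// Map every line-ending style to "\n". strtr() tries the longest keys first,
// so "\r\n" is replaced as a unit before a lone "\r". A "\n" => "\n" entry
// would be a no-op, so it is omitted.
$foo = strtr($foo, array(
    "\r\n" => "\n",
    "\r"   => "\n",
));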
?>
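As a rough sketch of how the normalization step fits together with the split, the snippet below normalizes a sample string and then explodes it on a single "\n". The sample value and the variable names `$normalized` and `$items` are made up for illustration.

```php
<?php
// Hypothetical sample input mixing Windows, old Mac, and Unix line endings.
$foo = "apple\r\nbanana\rcherry\ndate";

// Normalize every line-ending style to "\n" (longest keys are tried first).
$normalized = strtr($foo, array(
    "\r\n" => "\n",
    "\r"   => "\n",
));

// After normalization a single-character delimiter is enough.
$items = explode("\n", $normalized);

print_r($items);
```

Blank lines in the input would still show up as empty strings in `$items`, so filter them afterwards if they are not wanted.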
Alternatively, you can explode with regular expressions:
<?php
$foo = preg_split('/[\r\n]+/', $foo);
?>
$arr = preg_split( "/[\n\r]+/", $textareaInput );
A lone '\r' as a line terminator is an old convention that is not really used anymore (not since OS X, which is Unix-based).
Your explode will be fine. Just trim off the '\r' in each resulting element for the Windows users.
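A minimal sketch of that suggestion, assuming the input arrives in a variable named `$textareaInput` (the sample value here is invented): split on "\n" and strip a trailing "\r" from each piece.

```php
<?php
// Hypothetical input as a Windows browser would submit it.
$textareaInput = "first\r\nsecond\r\nthird";

// Split on the Unix newline; Windows lines keep a trailing "\r".
$lines = explode("\n", $textareaInput);

// Strip only a trailing carriage return, leaving other whitespace untouched.
$lines = array_map(function ($line) {
    return rtrim($line, "\r");
}, $lines);

print_r($lines);
```

Using `rtrim($line, "\r")` instead of a bare `trim()` keeps any intentional leading or trailing spaces in the items.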
You can use preg_split to do that.
$arr = preg_split('/[\r\n]+/', $textareaInput);
It splits the input on any combination of the \r or \n characters. You can also use \s to include any white-space char.
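To make the difference between the two character classes concrete, here is a small comparison on an invented sample string. Note that \s also matches spaces and tabs, so it breaks multi-word items apart.

```php
<?php
// Hypothetical input: two items, each containing a space.
$textareaInput = "red apple\r\ngreen pear";

// Split on line-break characters only: multi-word items stay intact.
$byLine = preg_split('/[\r\n]+/', $textareaInput);

// Split on any whitespace: spaces inside an item become delimiters too.
$byWhitespace = preg_split('/\s+/', $textareaInput);

print_r($byLine);
print_r($byWhitespace);
```

The \s variant is therefore only a good fit when each item is a single word.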
Edit
It occurred to me that, while the previous code works fine, it also removes empty lines. If you want to preserve the empty lines, you may want to try this instead:
$arr = preg_split('/(\r\n|[\r\n])/', $textareaInput);
It basically starts by looking for the Windows version \r\n, and if that fails it looks for either the old Mac version \r or the Unix version \n.
For example:
<?php
$text = "Windows\r\n\r\nMac\r\rUnix\n\nDone!";
$arr = preg_split('/(\r\n|[\r\n])/', $text);
print_r($arr);
?>
Prints:
Array
(
    [0] => Windows
    [1] =>
    [2] => Mac
    [3] =>
    [4] => Unix
    [5] =>
    [6] => Done!
)
The following code should do the job:
<?php
$split = preg_split('/[\r\n]+/', $src);
foreach ($split as $k => $string) {
    $split[$k] = trim($string);
    // Compare against '' instead of using empty(), which would also
    // discard a line that contains only "0".
    if ($split[$k] === '') {
        unset($split[$k]);
    }
}
// Join the remaining pieces with no separator.
$join = implode('', $split);
?>
to get a string with the newlines completely stripped. It won't work correctly with JS though :(
The system-agnostic technique with regex involves the \R escape sequence.
PHP Documentation on Escape Sequences
It really is as simple as calling preg_split('~\R~', $textareaInput).
\R - line break: matches \n, \r and \r\n
Normalizing the input is a waste of time and effort if you are just going to explode on the replacement characters anyhow.
If you are worried about multiple consecutive newline characters in the string, you can just add the + quantifier after \R.
If you want to trim whitespace characters from both sides of the strings in the resultant array, you can use ~\s*\R\s*~.
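The three \R variants can be compared side by side. This is only a sketch on an invented sample string, and the result variable names are placeholders.

```php
<?php
// Hypothetical sample input with mixed line endings and stray spaces.
$textareaInput = "one \r\n\r\n two\rthree";

// Plain \R: every line break is a delimiter, so blank lines survive as ''.
$keepEmpty = preg_split('~\R~', $textareaInput);
// ["one ", "", " two", "three"]

// \R+: a run of consecutive line breaks counts as one delimiter.
$skipEmpty = preg_split('~\R+~', $textareaInput);
// ["one ", " two", "three"]

// \s*\R\s*: whitespace touching a line break is consumed along with it.
$trimmed = preg_split('~\s*\R\s*~', $textareaInput);
// ["one", "two", "three"]

print_r($trimmed);
```

The last pattern only removes whitespace adjacent to a line break, so whitespace at the very start or end of the whole string is left alone. Calling trim() on the input first takes care of that.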