I know that there are many types of space (em space, en space, thin space, non-breaking space, etc.), but all the ones I mentioned have HTML entities (at least, PHP's htm
$result = preg_replace('/\s/', '', $yourString);
See http://www.php.net/manual/en/regexp.reference.backslash.php for more information on the \s escape sequence.
\s, by default, will not match whitespace characters outside the ASCII range (code points above 127). To get at those, you can instead make good use of other UTF-8-aware sequences.
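A minimal sketch of that difference, assuming PHP 7 or later (for the "\u{...}" string escape) and the default C locale. The sample string is made up for illustration:

```php
<?php
// Hypothetical sample: "a", U+2003 EM SPACE, "b", an ASCII space, "c".
$sample = "a\u{2003}b c";

// Byte mode: \s only recognises the ASCII space, so the em space survives.
$byteMode = preg_replace('/\s/', '', $sample);

// UTF-8 mode (u modifier): the em space is removed as well.
$utf8Mode = preg_replace('/\s/u', '', $sample);

// Hex dumps make the surviving bytes visible.
echo bin2hex($byteMode), PHP_EOL;
echo bin2hex($utf8Mode), PHP_EOL;
```

In the first dump the three bytes of the em space (e2 80 83) should still be present; in the second they should be gone.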
(Standard disclaimer: I'm skimming the PCRE source code to compile the lists below, so I may miss a character or type something incorrectly. Please forgive me.)
\p{Zs} matches:
\h (Horizontal whitespace) matches the same as \p{Zs} above, plus
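To check which code points these two classes match on a particular PHP build, something like the following sketch could be used. The helper names utf8_chr and matching_code_points are hypothetical, the scan stops at U+3000 by assumption, and the output depends on the Unicode tables bundled with your PCRE library:

```php
<?php
// Encode a code point below U+10000 as UTF-8 without needing mbstring.
function utf8_chr(int $cp): string
{
    if ($cp < 0x80) {
        return chr($cp);
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xE0 | ($cp >> 12))
        . chr(0x80 | (($cp >> 6) & 0x3F))
        . chr(0x80 | ($cp & 0x3F));
}

// Return every code point in 0..$max that the given pattern matches.
function matching_code_points(string $pattern, int $max = 0x3000): array
{
    $found = [];
    for ($cp = 0; $cp <= $max; $cp++) {
        if (preg_match($pattern, utf8_chr($cp)) === 1) {
            $found[] = sprintf('U+%04X', $cp);
        }
    }
    return $found;
}

$zs = matching_code_points('/\p{Zs}/u');
$h  = matching_code_points('/\h/u');

echo '\p{Zs}:  ', implode(' ', $zs), PHP_EOL;
// Code points that \h matches but \p{Zs} does not.
echo '\h only: ', implode(' ', array_diff($h, $zs)), PHP_EOL;
```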
Similarly for matching vertical whitespace there are a few options.
\p{Zl} matches U+2028 Line separator.
\p{Zp} matches U+2029 Paragraph separator.
\v (Vertical whitespace) matches \p{Zl}, \p{Zp} and the following
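As a quick check of the vertical classes, here is a small sketch that tests a handful of candidate characters against each one. The labels and the choice of candidates are mine, and PHP 7 or later is assumed for the "\u{...}" escape:

```php
<?php
// Candidate characters, keyed by a label for display.
$candidates = [
    'LF  U+000A' => "\n",
    'VT  U+000B' => "\x0B",
    'FF  U+000C' => "\f",
    'CR  U+000D' => "\r",
    'NEL U+0085' => "\u{0085}",
    'LS  U+2028' => "\u{2028}",
    'PS  U+2029' => "\u{2029}",
    'SP  U+0020' => ' ', // horizontal, included as a negative control
];

foreach ($candidates as $label => $char) {
    // preg_match() returns 1 on a match and 0 otherwise.
    printf(
        "%s  Zl:%d  Zp:%d  v:%d\n",
        $label,
        preg_match('/\p{Zl}/u', $char),
        preg_match('/\p{Zp}/u', $char),
        preg_match('/\v/u', $char)
    );
}
```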
Going back to the beginning, in UTF-8 mode (i.e. using the u pattern modifier) \s will match any character that \p{Z} matches (which is anything that \p{Zs}, \p{Zl} and \p{Zp} will match), plus
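The relationship between \s and \p{Z} in UTF-8 mode can be probed with a few sample characters. This is only a sketch with hand-picked samples, not an exhaustive list, and it assumes PHP 7 or later:

```php
<?php
// Hand-picked samples: ASCII control whitespace and Unicode separators.
$chars = [
    'TAB U+0009'      => "\t",
    'LF U+000A'       => "\n",
    'FF U+000C'       => "\f",
    'CR U+000D'       => "\r",
    'NBSP U+00A0'     => "\u{00A0}",
    'EM SPACE U+2003' => "\u{2003}",
    'LS U+2028'       => "\u{2028}",
];

foreach ($chars as $label => $char) {
    printf(
        "%-16s p{Z}:%d  s:%d\n",
        $label,
        preg_match('/\p{Z}/u', $char), // Unicode separator categories only
        preg_match('/\s/u', $char)     // separators plus control whitespace
    );
}
```

The control characters (tab, LF, FF, CR) belong to no separator category, so they should show up under \s only.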
To cut a long story short (I bet you read all of the above, didn't you?), you might want to use \s, but make sure to be in UTF-8 mode, like /\s/u. Putting that to some practical use, to filter out those matching whitespace characters from a string you would do something like:
$new_string = preg_replace('/\s/u', '', $old_string);
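One thing to watch for: with the u modifier, preg_replace() returns null when the subject is not valid UTF-8. A minimal sketch of a wrapper that surfaces this, where strip_whitespace is a hypothetical helper name and not part of PHP:

```php
<?php
// Hypothetical helper: remove everything \s matches in UTF-8 mode,
// failing loudly instead of silently returning null on bad input.
function strip_whitespace(string $input): string
{
    $result = preg_replace('/\s/u', '', $input);
    if ($result === null) {
        // Typically PREG_BAD_UTF8_ERROR when the input is not valid UTF-8.
        throw new InvalidArgumentException(
            'preg_replace() failed with error code ' . preg_last_error()
        );
    }
    return $result;
}

echo strip_whitespace("one\u{2003}two \tthree\n"), PHP_EOL;

try {
    strip_whitespace("\xFF"); // a lone 0xFF byte is never valid UTF-8
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), PHP_EOL;
}
```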
Finally, if you really, really care about the vertical whitespace characters which aren't included in \s (VT and NEL), then you can use the character class [\s\v] to match all 26 of the whitespace characters listed above.
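Whether \s on its own already covers those two characters appears to depend on the PCRE version PHP was built against, so it may be worth checking on your own setup. A small sketch, with a made-up test string:

```php
<?php
$vt  = "\x0B";      // U+000B line tabulation
$nel = "\u{0085}";  // U+0085 next line

// Report what \s does with each character on this particular build.
printf("PCRE version: %s\n", PCRE_VERSION);
printf("s matches VT:  %d\n", preg_match('/\s/u', $vt));
printf("s matches NEL: %d\n", preg_match('/\s/u', $nel));

// The combined class removes them either way.
$dirty = "a{$vt}b{$nel}c\u{2003}d\te";
$clean = preg_replace('/[\s\v]/u', '', $dirty);
var_dump($clean);
```

If both lines report 1, plain /\s/u is enough on that build; [\s\v] is the safer choice when the code has to run on older installations.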
They are all plain spaces (returning character code 32) that can be caught with regular expressions or trim().
Try this:
$text = preg_replace("/\s{2,}/", " ", $text);
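For illustration, here is how that pattern behaves on a made-up string, next to a \s+ variant. Note that {2,} leaves a single tab or newline untouched, while \s+ turns every run, including a single character, into one space:

```php
<?php
// Hypothetical input with mixed runs of spaces, tabs and newlines.
$text = "  Hello,\t\tworld!  \n\n How   are you? ";

// Collapse every run of two or more whitespace characters into one space,
// then drop whatever is left at either end.
$collapsed = trim(preg_replace('/\s{2,}/', ' ', $text));
var_dump($collapsed);

// A lone tab is not a run of two, so {2,} keeps it; \s+ replaces it.
$single = "a\tb";
var_dump(preg_replace('/\s{2,}/', ' ', $single));
var_dump(preg_replace('/\s+/', ' ', $single));
```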