I have this simple pattern that splits a text into periods:
$text = preg_split("/[\.:!\?]+/", $text);
But I want to keep the delimiter (., :, ! or ?) at the end of each array item.
Here you go:
preg_split('/([^.:!?]+[.:!?]+)/', 'good:news.everyone!', -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
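How it works: the pattern turns each chunk of text, together with its trailing punctuation, into a delimiter. Because the pattern is wrapped in a capturing group, the PREG_SPLIT_DELIM_CAPTURE flag makes preg_split() include these delimiters in the result. With that flag alone, it returns an array like: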
array (
0 => '',
1 => 'good:',
2 => '',
3 => 'news.',
4 => '',
5 => 'everyone!',
6 => '',
);
To get rid of the empty values, use PREG_SPLIT_NO_EMPTY. To combine two or more of these flags, use the bitwise | operator. The result:
array (
0 => 'good:',
1 => 'news.',
2 => 'everyone!'
);
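The two flag combinations described above can be compared side by side. This is only a minimal sketch that reuses the pattern and sample string from this answer; the variable names are made up for illustration.

```php
<?php
// Pattern from the answer: a run of non-punctuation followed by its punctuation,
// wrapped in a capturing group so it can be returned as a delimiter.
$pattern = '/([^.:!?]+[.:!?]+)/';
$input   = 'good:news.everyone!';

// PREG_SPLIT_DELIM_CAPTURE alone: the captured delimiters are returned,
// interleaved with the empty strings found between them.
$withEmpties = preg_split($pattern, $input, -1, PREG_SPLIT_DELIM_CAPTURE);

// Both flags combined with bitwise OR: the empty strings are dropped.
$clean = preg_split($pattern, $input, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

var_export($clean);
```

Text that does not end in punctuation is never matched by the pattern, so it comes back as an ordinary (non-delimiter) piece: 'good:news' yields 'good:' and 'news'.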
There is no need for PREG_SPLIT_DELIM_CAPTURE if you use a positive lookbehind in your pattern; the function will keep the delimiters.
$text = preg_split('/(?<=[.:!?])/', 'good:news.everyone!', 0, PREG_SPLIT_NO_EMPTY);
A lookbehind only checks that the character is there without consuming it, so the match itself is empty. Since preg_split() discards only what the pattern actually matches, the character stays in the output.
The result without the PREG_SPLIT_NO_EMPTY flag:
array (
0 => 'good:',
1 => 'news.',
2 => 'everyone!',
3 => ''
);
The result with the PREG_SPLIT_NO_EMPTY flag:
array (
0 => 'good:',
1 => 'news.',
2 => 'everyone!'
);
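One thing to watch with the lookbehind: it splits after every single punctuation character, so a run such as "..." is broken into separate items. The sketch below shows that behaviour and one possible variation, a negative lookahead added to the same pattern. The variation is an assumption of this example, not part of the answer above, and the variable names are illustrative.

```php
<?php
$input = 'wait...what?!';

// Lookbehind from the answer: a split point after each punctuation character.
$perChar = preg_split('/(?<=[.:!?])/', $input, -1, PREG_SPLIT_NO_EMPTY);
// $perChar is ['wait.', '.', '.', 'what?', '!']

// Variation (assumption): only split when the next character is not punctuation,
// so a run of punctuation stays attached to the text before it.
$perRun = preg_split('/(?<=[.:!?])(?![.:!?])/', $input, -1, PREG_SPLIT_NO_EMPTY);
// $perRun is ['wait...', 'what?!']

var_export($perRun);
```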
You can test it using this PHP Online Function Tester.