I'm trying to create a regular expression that replaces words which are not enclosed in brackets.
Here is what I currently have:
$this->parse
After another morning of playing with the regex I came up with quite a dirty solution that isn't flexible at all, but works for my use case.
$this->parsed = preg_replace('/\b(?!\[(|((\w+)(\s|\.))|((\w+)(\s|\.)(\w+)(\s|\.))))('.preg_quote($word).')(?!(((\s|\.)(\w+))|((\s|\.)(\w+)(\s|\.)(\w+))|)\[)\b/s','[$10['.implode(",",array_unique($types)).']]',$this->parsed);
What it basically does is check for brackets with no words, 1 word or 2 words in front or behind it in combination with the specified keyword.
Still, it would be great to hear if anyone has a better solution.
You may match any substring inside square brackets with the \[[^][]*] pattern, then use the (*SKIP)(*FAIL) PCRE verbs to drop that match, so that your pattern is only matched in any other context:
\[[^][]*](*SKIP)(*FAIL)|your_pattern_here
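As a minimal sketch of the idea above, the snippet below plugs a keyword into the generic pattern. The input string, the keyword foo and the replacement X are made up purely for illustration.

```php
<?php
// Hypothetical input: "foo" appears both inside and outside square brackets.
$text = 'foo [foo] bar foo';

// Left alternative: consume a whole [...] chunk, then discard it with (*SKIP)(*FAIL).
// Right alternative: the keyword we actually want to replace.
$pattern = '/\[[^][]*](*SKIP)(*FAIL)|\bfoo\b/';

$result = preg_replace($pattern, 'X', $text);

echo $result, PHP_EOL; // X [foo] bar X
```

The bracketed [foo] is left alone because the engine skips past the whole bracketed chunk before it gets a chance to try the keyword there.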
See the regex demo. To skip matches inside paired nested square brackets, use a recursion-based regex with a subroutine (note that it needs a capturing group):
(?<skip>\[(?:[^][]++|(?&skip))*])(*SKIP)(*FAIL)|your_pattern_here
See a regex demo
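To illustrate the difference, here is a small sketch that runs the flat and the recursive variants on the same input. The sample string and the keyword foo are assumptions chosen only for the demonstration.

```php
<?php
// Hypothetical input with one level of nesting.
$text = 'foo [a [foo] foo] foo';

// Flat variant: only protects innermost [...] chunks without nested brackets.
$flat = '/\[[^][]*](*SKIP)(*FAIL)|\bfoo\b/';

// Recursive variant: the named group "skip" calls itself via (?&skip),
// so a balanced, nested [...] chunk is consumed as a whole.
$nested = '/(?<skip>\[(?:[^][]++|(?&skip))*])(*SKIP)(*FAIL)|\bfoo\b/';

$flatResult   = preg_replace($flat, 'X', $text);
$nestedResult = preg_replace($nested, 'X', $text);

echo $flatResult, PHP_EOL;   // X [a [foo] X] X
echo $nestedResult, PHP_EOL; // X [a [foo] foo] X
```

With the flat pattern, the foo that sits in the outer bracket after the inner [foo] is still replaced; the recursive pattern protects everything between the outermost pair.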
Also, since you are building the pattern dynamically, you need to preg_quote the $word along with the delimiter symbol (here, /).
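A short sketch of why the second argument matters, assuming a made-up keyword that happens to contain the delimiter:

```php
<?php
// Hypothetical keyword containing both the delimiter (/) and a regex metacharacter (.).
$word = 'a/b.c';

$default   = preg_quote($word);      // a/b\.c   (the slash is left as-is)
$withDelim = preg_quote($word, '/'); // a\/b\.c  (the slash is escaped too)

// Only the second form is safe to embed between / delimiters.
$pattern = '/\b(?:' . $withDelim . ')\b/';

$hit  = preg_match($pattern, 'see a/b.c here'); // 1: literal match
$miss = preg_match($pattern, 'see a/bXc here'); // 0: the dot is not a wildcard

echo $default, PHP_EOL, $withDelim, PHP_EOL, $hit, ' ', $miss, PHP_EOL;
```

Without the delimiter argument, the unescaped slash would end the pattern early and the rest would be read as modifiers.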
Your solution is
$this->parsed = preg_replace(
'/\[[^][]*\[[^][]*]](*SKIP)(*FAIL)|\b(?:' . preg_quote($word, '/') . ')\b/',
'[$0[' . implode(",", array_unique($types)) . ']]',
$this->parsed);
The \[[^][]*\[[^][]*]] regex will match all those occurrences that have been wrapped with your replacement pattern:
\[ - a [ char
[^][]* - 0+ chars other than [ and ]
\[ - a [ char
[^][]* - 0+ chars other than [ and ]
]] - a ]] substring.
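One way to check this is to run the replacement twice on the same text: the second pass should change nothing, since every occurrence is already wrapped. The sketch below uses plain variables instead of $this->parsed, and the keyword, types and sentence are made up for illustration.

```php
<?php
// Hypothetical stand-ins for the values used in the question.
$word  = 'apple';
$types = ['fruit', 'food', 'fruit'];
$text  = 'An apple a day';

$pattern = '/\[[^][]*\[[^][]*]](*SKIP)(*FAIL)|\b(?:' . preg_quote($word, '/') . ')\b/';
$replacement = '[$0[' . implode(',', array_unique($types)) . ']]';

$once  = preg_replace($pattern, $replacement, $text);
$twice = preg_replace($pattern, $replacement, $once);

echo $once, PHP_EOL;  // An [apple[fruit,food]] a day
echo $twice, PHP_EOL; // An [apple[fruit,food]] a day (unchanged)
```

The first pass wraps the bare word; on the second pass the wrapped [apple[fruit,food]] chunk is consumed by the left alternative and skipped.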