This is a follow-up after reading How to specify "Space or end of string" and "space or start of string"?
That question describes a way to match a whole word.
From the regular-expressions.info Word boundaries page:
The metacharacter \b is an anchor like the caret and the dollar sign. It matches at a position that is called a "word boundary". This match is zero-length.
There are three different positions that qualify as word boundaries:
- Before the first character in the string, if the first character is a word character.
- After the last character in the string, if the last character is a word character.
- Between two characters in the string, where one is a word character and the other is not a word character.
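The three positions listed above can be made visible by asking PCRE where \b matches. This is only a small sketch: the helper name wordBoundaryOffsets and the sample strings are made up for illustration and are not part of the quoted page.

```php
<?php
// Hypothetical helper: collect the byte offsets of every position
// where the zero-length anchor \b matches in $subject.
function wordBoundaryOffsets(string $subject): array
{
    preg_match_all('/\b/', $subject, $matches, PREG_OFFSET_CAPTURE);
    // Each entry in $matches[0] is [matched text, offset]; keep only the offsets.
    return array_column($matches[0], 1);
}

// "ab, cd": start of string, after "b", before "c", end of string.
echo implode(', ', wordBoundaryOffsets('ab, cd')), PHP_EOL; // 0, 2, 4, 6

// "-ab-": the first and last characters are not word characters,
// so the start and end of the string are not boundaries here.
echo implode(', ', wordBoundaryOffsets('-ab-')), PHP_EOL;   // 1, 3
```

Every match is empty, which is what "zero-length" means in practice: \b consumes no characters and only asserts something about a position.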
A very good explanation from nhahtdh's post:
A word boundary \b is equivalent to:
(?:(?<!\w)(?=\w)|(?<=\w)(?!\w))
Which means:
Right ahead, there is (at least) a character that is a word character, and right behind, we cannot find a word character (either the character is not a word character, or it is the start of the string).
OR
Right behind, there is (at least) a character that is a word character, and right ahead, we cannot find a word character (either the character is not a word character, or it is the end of the string).
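The two alternatives described above correspond to the lookaround expression (?:(?<!\w)(?=\w)|(?<=\w)(?!\w)). As a rough sanity check, the sketch below compares the positions where that expression and \b match on a few sample strings. The helper matchOffsets and the samples are assumptions added for illustration.

```php
<?php
// Hypothetical helper: byte offsets of all matches of $pattern in $subject.
function matchOffsets(string $pattern, string $subject): array
{
    preg_match_all($pattern, $subject, $matches, PREG_OFFSET_CAPTURE);
    return array_column($matches[0], 1);
}

$boundary = '/\b/';
// "word char ahead, none behind" OR "word char behind, none ahead"
$expanded = '/(?:(?<!\w)(?=\w)|(?<=\w)(?!\w))/';

foreach (['ab, cd', '-ab-', 'a_1 b!', ''] as $sample) {
    $same = matchOffsets($boundary, $sample) === matchOffsets($expanded, $sample);
    echo var_export($sample, true), ': ',
        $same ? 'same positions' : 'different positions', PHP_EOL;
}
```

Agreement on a handful of strings is not a proof, but it is a quick way to confirm that the expansion has been read correctly.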
\b is not suitable here because what it requires depends on the characters immediately on both sides of it: it matches only where a word character meets a non-word character (or the start or end of the string). When you build a regex dynamically, you do not know in advance which one to use at each end, \B or \b. For your case, you could use '/\bstackoverflow=""\B/', but that would require logic that appends a word or non-word boundary depending on the first and last characters of the search string. However, there is an easier way: use negative lookarounds.
(?<!\w)stackoverflow=""(?!\w)
See regex demo
The regex contains negative lookarounds instead of word boundaries. The (?<!\w) lookbehind fails the match if there is a word character before stackoverflow="", and the (?!\w) lookahead fails the match if stackoverflow="" is followed by a word character.
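To see the difference in behaviour, the sketch below runs the lookaround pattern and a naive \b...\b pattern against the same subject. The subject string and the helper offsetsOf are invented for this illustration.

```php
<?php
// Hypothetical helper: byte offsets of all matches of $pattern in $subject.
function offsetsOf(string $pattern, string $subject): array
{
    preg_match_all($pattern, $subject, $matches, PREG_OFFSET_CAPTURE);
    return array_column($matches[0], 1);
}

// Three occurrences: standalone, glued to a leading "x", glued to a trailing "y".
$subject = 'stackoverflow="" xstackoverflow="" stackoverflow=""y';

$lookarounds = '/(?<!\w)stackoverflow=""(?!\w)/';
$boundaries  = '/\bstackoverflow=""\b/';

// Lookarounds accept only the standalone occurrence at offset 0.
print_r(offsetsOf($lookarounds, $subject));

// \b after the closing quote demands a word character next, so this
// pattern skips the standalone occurrence and matches the one glued to "y".
print_r(offsetsOf($boundaries, $subject));
```

The naive version fails in both directions at once: it rejects the occurrence you want and accepts one you do not.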
What the word shorthand character class \w matches depends on whether you enable the Unicode modifier /u. Without it, \w matches just [a-zA-Z0-9_]. You can add further restrictions using the lookarounds.
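A quick way to observe the effect of /u on \w is to test a word containing a non-ASCII letter. This is a sketch under two assumptions: the script runs in the default "C" locale (PCRE's byte-mode \w can be influenced by locale settings), and the helper name isAllWordChars is made up.

```php
<?php
// Hypothetical helper: does $text consist solely of \w characters?
function isAllWordChars(string $text, bool $unicode): bool
{
    $pattern = $unicode ? '/^\w+$/u' : '/^\w+$/';
    return preg_match($pattern, $text) === 1;
}

// "café", written with an escape so the source file encoding does not matter.
$word = "caf\u{00E9}";

var_dump(isAllWordChars($word, false)); // byte mode: the bytes of "é" are not \w
var_dump(isAllWordChars($word, true));  // Unicode mode: "é" counts as a word character
```

The same distinction applies inside (?<!\w) and (?!\w): without /u, an accented letter next to the search string does not block the match.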
PHP demo:
$re = '/(?<!\w)stackoverflow=""(?!\w)/';
NOTE: If you pass your string as a variable, remember to escape all special characters in it with preg_quote:
$re = '/(?<!\w)' . preg_quote($search, '/') . '(?!\w)/'; // $search is a placeholder name for your variable
Here, notice the second argument to preg_quote, which is /, the regex delimiter char. Passing it makes preg_quote escape the delimiter as well.
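Putting the pieces together, a dynamic pattern could be built along these lines. The function name wholeTokenPattern, its $unicode flag, and the sample inputs are assumptions for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper: wrap an arbitrary literal string in negative
// lookarounds so it only matches when not adjacent to a word character.
function wholeTokenPattern(string $needle, bool $unicode = false): string
{
    // preg_quote escapes regex metacharacters; the second argument also
    // escapes the delimiter "/" so it cannot end the pattern early.
    return '/(?<!\w)' . preg_quote($needle, '/') . '(?!\w)/' . ($unicode ? 'u' : '');
}

$pattern = wholeTokenPattern('a/b.c');
echo $pattern, PHP_EOL; // /(?<!\w)a\/b\.c(?!\w)/

var_dump(preg_match($pattern, 'x a/b.c y')); // standalone occurrence
var_dump(preg_match($pattern, 'x a/bxc y')); // the "." is literal, so no match
var_dump(preg_match($pattern, 'za/b.c'));    // preceded by a word character
```

Because the lookarounds do not care whether the needle starts or ends with a word character, the same wrapper works for inputs such as stackoverflow="" without any per-input choice between \b and \B.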