I have a regex that matches words fine, except when they contain a special character such as ~Query, which is the name of a member of a C++ class. Need to use word boundary as sh
\b is short for
(?:(?<!\w)(?=\w)|(?<=\w)(?!\w))
If you want to treat ~ as a word character, change \w to [\w~]:
(?:(?<![\w~])(?=[\w~])|(?<=[\w~])(?![\w~]))
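To make the difference concrete, here is a small sketch comparing plain \b with the extended boundary. The sample string and variable names are made up for illustration and are not from the question.

```perl
use strict;
use warnings;

my $text = 'void Foo::~Query() and Foo::Query()';  # hypothetical sample input

# Boundary that counts '~' as a word character.
my $wc       = qr/[\w~]/;
my $boundary = qr/(?<!$wc)(?=$wc)|(?<=$wc)(?!$wc)/;

# Plain \b: ':' and '~' are both non-word characters, so there is no
# boundary between them and \b~Query cannot match.
my $plain  = $text =~ /\b~Query\b/                ? 1 : 0;
my $custom = $text =~ /$boundary~Query$boundary/  ? 1 : 0;

# Searching for the bare name: \b also finds it inside ~Query,
# while the extended boundary only finds the standalone Query.
my @with_b        = $text =~ /\bQuery\b/g;
my @with_boundary = $text =~ /${boundary}Query$boundary/g;

print "plain=$plain custom=$custom\n";                            # plain=0 custom=1
print scalar(@with_b), " vs ", scalar(@with_boundary), "\n";      # 2 vs 1
```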
Example usage:
my $word_char = qr/[\w~]/;
my $boundary = qr/(?<!$word_char)(?=$word_char)
|(?<=$word_char)(?!$word_char)/x;
$key =~ /$boundary$match$boundary/
If we know $match can only match something that starts and ends with a $word_char, we can simplify as follows:
my $word_char = qr/[\w~]/;
my $start_bound = qr/(?<!$word_char)/;
my $end_bound = qr/(?!$word_char)/;
$key =~ /$start_bound$match$end_bound/
This is simple enough that we can inline it:
$key =~ /(?<![\w~])$match(?![\w~])/
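A minimal sketch of the inlined form in use. The helper name has_member is hypothetical, and the \Q...\E quoting is an addition that assumes $name is literal text rather than a pattern; drop it if $match is meant to be a regex.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $name occurs in $key as a whole "word",
# where '~' counts as a word character alongside \w.
sub has_member {
    my ($key, $name) = @_;
    # \Q...\E escapes regex metacharacters in $name (assumption: the
    # name is a literal identifier, not a pattern).
    return $key =~ /(?<![\w~])\Q$name\E(?![\w~])/ ? 1 : 0;
}

print has_member('Foo::~Query()', '~Query'), "\n";  # 1
print has_member('Foo::~Query()', 'Query'),  "\n";  # 0: preceded by '~'
print has_member('my_Query_x',    'Query'),  "\n";  # 0: surrounded by '_'
```

As in the text above, this relies on the name starting and ending with a character from [\w~]; otherwise the full two-sided boundary is needed.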
Assuming you don't need to check the contents of $match (i.e. it always contains a valid identifier), you can write this:
$key =~ /(?<![~\w])$match(?![~\w])/
which simply checks that the string in $match isn't preceded or followed by alphanumerics, underscores or tildes.