Question
Why does the letter é count as a word boundary matching \b in the following example?
Pattern: /\b(cum)\b/i
Text: écumé
This matches 'cum', which is not desired.
Is it possible to overcome this?
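Answer 1:
It will work when you add the u modifier to your regex. Without it, PCRE processes the subject byte by byte, so the two UTF-8 bytes that encode é are not recognized as word characters and \b matches right after them.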
/\b(cum)\b/iu
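The byte-versus-character mechanism can be illustrated outside PHP. The following is a minimal Python sketch, not PHP code. It assumes that a bytes pattern in Python's `re` module is a fair stand-in for PCRE without the u modifier (ASCII-only word characters over UTF-8 bytes) and that a str pattern is a fair stand-in for PCRE with it. The variable names are illustrative.

```python
import re

text = "écumé"

# Byte-oriented matching, used here as an analogue of PCRE without "u":
# "é" is encoded as the two bytes 0xC3 0xA9, neither of which is an ASCII
# word character, so \b matches between "é" and "c".
byte_match = re.search(rb"\b(cum)\b", text.encode("utf-8"), re.IGNORECASE)

# Character-oriented matching, used here as an analogue of PCRE with "u":
# "é" is a single word character, so there is no boundary next to "cum".
char_match = re.search(r"\b(cum)\b", text, re.IGNORECASE)

print(byte_match.group(1) if byte_match else None)  # b'cum'
print(char_match)  # None
```

The first search reproduces the unwanted match from the question, and the second shows it disappearing once the text is handled as characters rather than bytes.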
Answer 2:
To deal with Unicode, replace \b with lookarounds that test for a non-letter (\PL) or the start or end of the string:
/(?<=^|\PL)(cum)(?=\PL|$)/i
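As a rough illustration of the lookaround idea, here is a Python sketch rather than PHP. Python's `re` module supports neither `\PL` nor a lookbehind whose alternatives differ in width, so the sketch swaps in negative lookarounds and the class `[^\W\d_]`, which only approximates "any letter". The helper name `find_word` is hypothetical.

```python
import re

# [^\W\d_] means "a word character that is neither a digit nor an underscore".
# It is used here as an approximation of the Unicode letter property \pL.
LETTER = r"[^\W\d_]"


def find_word(word, text):
    """Return a match for `word` only if no letter is directly adjacent to it."""
    # "Not preceded by a letter" covers both the start of the string and a
    # preceding non-letter; the lookahead works the same way at the end.
    pattern = rf"(?<!{LETTER})({re.escape(word)})(?!{LETTER})"
    return re.search(pattern, text, re.IGNORECASE)


print(find_word("cum", "écumé"))                     # None
print(find_word("cum", "magna cum laude").group(1))  # cum
```

Unlike \b, this boundary is defined by letters only, so digits and underscores next to the word do not block a match.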
Source: https://stackoverflow.com/questions/22068702/regular-expression-pcre-php-word-boundary-b-and-accent-characters