Question
How can I match all the “special” chars (like +_*&^%$#@!~) except the char - in PHP?
I know that \W will match all the “special” chars including the -.
Any suggestions in consideration of Unicode letters?
Answer 1:
[^-] matches everything except the one character you want to leave out.
[\W] matches all special characters, as you know.
[^\w] matches all special characters as well. Sounds fair?
So [^\w-] is the combination of both: all "special" characters, but without the -.
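A minimal sketch of how [^\w-] behaves with preg_match_all. The helper name findSpecialExceptHyphen and the sample string are assumptions made for illustration, not part of the original answer. Note that \w includes the underscore, so _ is not reported as special here.

```php
<?php
// Hypothetical helper (name is illustrative, not from the answer).
function findSpecialExceptHyphen(string $input): array
{
    // [^\w-] : any character that is neither a word character (\w) nor a hyphen.
    // The hyphen is the last item in the class, so it is taken literally.
    preg_match_all('/[^\w-]/', $input, $matches);
    return $matches[0];
}

// '-' is excluded explicitly; '_' is excluded because \w covers it.
print_r(findSpecialExceptHyphen('foo-bar+baz_qux*!'));
```

Without the u modifier, PCRE in PHP works byte by byte, so non-ASCII letters in a UTF-8 string are not treated as letters by this pattern.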
Answer 2:
- \pL matches any character with the Unicode Letter character property, which is a major general category group; that is, it matches [\p{Ll}\p{Lt}\p{Lu}\p{Lm}\p{Lo}].
- \pN matches any character with the Unicode Number character property, which is a major general category group; that is, it matches [\p{Nd}\p{Nl}\p{No}].
- Note that the Unicode Alphabetic character property also includes certain combining marks such as U+0345 ◌ͅ ᴄᴏᴍʙɪɴɪɴɢ ɢʀᴇᴇᴋ ʏᴘᴏɢᴇɢʀᴀᴍᴍᴇɴɪ. I suggest that you also include \pM, which matches any character with the Unicode Mark character property, which is a major general category group; that is, it matches [\p{Mn}\p{Me}\p{Mc}].
- Character U+002D ʜʏᴘʜᴇɴ-ᴍɪɴᴜꜱ is probably the - you’re referring to.
- Note though that Unicode v6.1 has 27 characters with the Unicode Dash character property, including such common characters as U+2010 ʜʏᴘʜᴇɴ, U+2013 ᴇɴ ᴅᴀꜱʜ, U+2014 ᴇᴍ ᴅᴀꜱʜ, and U+2212 ᴍɪɴᴜꜱ ꜱɪɢɴ. Whether you actually want to include or exclude those, I have no idea.
Given all that, it is not unlikely that you want something like:
[^\pL\pN\pM\x2D\x{2010}-\x{2015}\x{2212}]
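As a rough illustration, the character class above can be used in PHP like this. The sketch assumes the input is valid UTF-8 and adds the u modifier, which PHP's PCRE needs for \x{...} code points above 0xFF and for matching whole characters rather than bytes. The helper name findNonAlnumNonDash and the sample string are hypothetical.

```php
<?php
// Hypothetical helper (name is illustrative, not from the answer).
function findNonAlnumNonDash(string $input): array
{
    // Excludes letters (\pL), numbers (\pN), marks (\pM), U+002D,
    // the dashes U+2010..U+2015, and U+2212 MINUS SIGN.
    // The u modifier makes PCRE treat pattern and subject as UTF-8.
    $pattern = '/[^\pL\pN\pM\x2D\x{2010}-\x{2015}\x{2212}]/u';
    preg_match_all($pattern, $input, $matches);
    return $matches[0];
}

// "déjà-vu", an en dash (U+2013), "100%+", then "e" with a
// combining acute accent (U+0301), then "!".
$sample = "d\u{e9}j\u{e0}-vu\u{2013}100%+e\u{301}!";
print_r(findNonAlnumNonDash($sample));
```

Accented letters, the combining mark, and both dash characters are skipped; only %, + and ! are returned for this sample.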
Answer 3:
You can try this pattern:
([^a-zA-Z-])
This should match every character that is not an ASCII letter (a-z or A-Z) and is not the -.
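A short sketch of what this pattern does in practice. The helper name findNonAsciiLetters and the sample string are assumptions for illustration. Because the class lists only ASCII letters and the hyphen, digits and the underscore are also reported as matches, and non-ASCII letters are not recognised as letters.

```php
<?php
// Hypothetical helper (name is illustrative, not from the answer).
function findNonAsciiLetters(string $input): array
{
    // [^a-zA-Z-] : anything other than an ASCII letter or a hyphen.
    preg_match_all('/([^a-zA-Z-])/', $input, $matches);
    return $matches[0];
}

// Digits and '_' are matched too, since only letters and '-' are excluded.
print_r(findNonAsciiLetters('ab-12_c!'));
```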
Source: https://stackoverflow.com/questions/9727097/how-to-match-with-regex-all-special-chars-except-in-php