Question
In Perl, s/[^\w:]//g
removes all non-alphanumeric characters except the colon (:).
In Python I'm using re.sub(r'\W+', '', mystring),
which removes all non-alphanumeric characters except the underscore (_).
Is there any way to add exceptions? I don't want to remove characters such as = and .
Previously I took the opposite approach, i.e. replacing all unwanted characters using re.sub('[!@#\'\"$()]', '', mystring)
However, I cannot predict every character that may appear in mystring, so I want to remove all non-alphanumeric characters except a few.
Google didn't provide an appropriate answer. The closest search was "python regex split any \W+ with some exceptions", but that didn't help me either.
Answer 1:
You can list every character that should not be removed inside the negated character class.
re.sub(r'[^\w'+removelist+']', '',mystring)
Test
>>> import re
>>> removelist = "=."
>>> mystring = "asdf1234=.!@#$"
>>> re.sub(r'[^\w'+removelist+']', '',mystring)
'asdf1234=.'
Here the removelist variable is a string containing all the characters you want to exclude from removal (despite its name, it lists the characters to keep).
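One caveat with plain string concatenation: if the exception list contains characters that are special inside a character class (such as ], \, ^ or -), the pattern can break or change meaning. A minimal sketch of a safer variant follows, assuming a hypothetical helper named strip_except (not part of the original answer) that escapes the list with re.escape first:

```python
import re

def strip_except(text, keep=""):
    """Remove every character that is neither a word character nor in `keep`.

    `strip_except` and `keep` are illustrative names, not from the original answer.
    """
    # re.escape turns characters such as ']', '\\', '^' and '-' into literals,
    # so they cannot alter the structure of the character class.
    pattern = r"[^\w" + re.escape(keep) + r"]"
    return re.sub(pattern, "", text)

print(strip_except("asdf1234=.!@#$", keep="=."))  # asdf1234=.
print(strip_except("a-b]c^d!", keep="-]^"))       # a-b]c^d
```

For the simple "=." case this behaves the same as the concatenated pattern above; the escaping only matters once the list contains class metacharacters.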
What does "negated character class" mean?
When ^ is the first character inside a character class, it does not act as an anchor; instead, it negates the character class.
That is, in a character class such as [^abc], the ^ inverts the meaning of the class.
For example, [abc] matches a, b or c, whereas [^abc] matches any single character other than a, b or c.
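The difference can be checked quickly with re.findall. This is only an illustrative sketch; the sample string and variable names are made up for the demonstration:

```python
import re

sample = "abcdef"

# [abc] matches any one of a, b or c.
inside = re.findall(r"[abc]", sample)
# [^abc] matches any one character that is NOT a, b or c.
outside = re.findall(r"[^abc]", sample)
# Outside a character class, ^ anchors the match to the start of the string.
anchored = re.findall(r"^abc", sample)
not_at_start = re.findall(r"^def", sample)

print(inside)        # ['a', 'b', 'c']
print(outside)       # ['d', 'e', 'f']
print(anchored)      # ['abc']
print(not_at_start)  # []
```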
Answer 2:
re.sub(r'[^a-zA-Z0-9=]', '',mystring)
You can add any other character you want to keep, such as _, to the character class.
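Note that the explicit class and \w are not interchangeable: in Python 3, \w in a str pattern also matches the underscore and non-ASCII letters and digits, while a-zA-Z0-9 is ASCII-only. A small sketch of the difference, using a made-up sample string:

```python
import re

# "\u00ef" is the letter i with diaeresis, "\u20ac" is the euro sign.
text = "na\u00efve_user=42 \u20ac"

# \w is Unicode-aware for str patterns and includes the underscore.
word_class = re.sub(r"[^\w=]", "", text)
# The explicit ASCII class drops the accented letter and the underscore.
ascii_class = re.sub(r"[^a-zA-Z0-9=]", "", text)
# Adding _ to the explicit class keeps the underscore.
ascii_with_underscore = re.sub(r"[^a-zA-Z0-9_=]", "", text)
# re.ASCII restricts \w to [a-zA-Z0-9_].
word_class_ascii = re.sub(r"[^\w=]", "", text, flags=re.ASCII)

print(word_class)             # naïve_user=42
print(ascii_class)            # naveuser=42
print(ascii_with_underscore)  # nave_user=42
print(word_class_ascii)       # nave_user=42
```

Which one fits depends on whether accented letters and underscores should count as characters to keep.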
Answer 3:
I believe the approach you describe in Perl can also be used in Python, e.g.:
re.sub(r'[^\w=]', '', mystring)
would remove everything except word characters and =.
Source: https://stackoverflow.com/questions/27938765/replace-non-alphanumeric-characters-except-some-exceptions-python