It seems like there should be a simpler way than:
import string
s = "string. With. Punctuation?"  # Sample string
out = s.translate(string.maketrans(\"\",\
For Python 3 str or Python 2 unicode values, str.translate() takes only a mapping, such as a dictionary; codepoints (integers) are looked up in that mapping and anything mapped to None is removed.
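As a minimal sketch of that lookup (Python 3 only; the names table, sample and result are made up for illustration), a table can mix deletions and replacements:

```python
# Python 3: a translation table is a mapping keyed by integer codepoints.
table = {
    ord("."): None,  # mapped to None: the character is deleted
    ord("?"): "!",   # mapped to a string: the character is replaced
}

sample = "string. With. Punctuation?"
result = sample.translate(table)
print(result)  # string With Punctuation!
```

Characters that have no entry in the table pass through unchanged.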
To remove (some?) punctuation then, use:
import string
remove_punct_map = dict.fromkeys(map(ord, string.punctuation))
s.translate(remove_punct_map)
The dict.fromkeys() class method makes it trivial to create the mapping, setting all values to None based on the sequence of keys.
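To make the shape of that mapping concrete, here is a small Python 3 sketch comparing it with two equivalent spellings. The names explicit_map, maketrans_map and cleaned are illustrative only; the three-argument str.maketrans() form is a Python 3 alternative, not something the code above relies on.

```python
import string

# dict.fromkeys(keys) builds a dict whose values all default to None.
remove_punct_map = dict.fromkeys(map(ord, string.punctuation))

# The same table written as an explicit dict comprehension.
explicit_map = {ord(ch): None for ch in string.punctuation}

# Python 3 only: the third argument of str.maketrans() lists characters to delete.
maketrans_map = str.maketrans("", "", string.punctuation)

cleaned = "string. With. Punctuation?".translate(remove_punct_map)
print(cleaned)  # string With Punctuation
```

All three tables contain one entry per ASCII punctuation character, each mapped to None.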
To remove all punctuation, not just ASCII punctuation, your table needs to be a little bigger; see J.F. Sebastian's answer (Python 3 version):
import unicodedata
import sys
remove_punct_map = dict.fromkeys(i for i in range(sys.maxunicode)
                                 if unicodedata.category(chr(i)).startswith('P'))
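A quick usage sketch for the Unicode-wide table (Python 3; the sample text and the names text and stripped are made up for illustration). Building the table walks every codepoint, so it is worth constructing once and reusing:

```python
import sys
import unicodedata

# Every codepoint whose Unicode general category starts with "P" (punctuation).
remove_punct_map = dict.fromkeys(
    i for i in range(sys.maxunicode)
    if unicodedata.category(chr(i)).startswith("P")
)

text = "«Hola», dijo… ¿qué?"
stripped = text.translate(remove_punct_map)
print(stripped)  # Hola dijo qué
```

Note that this table follows the Unicode categories rather than string.punctuation: characters such as "$" and "+" are classified as symbols (categories starting with "S"), so they are not removed by it.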