In perl
, I can do the following with will pad my punctuation symbols with spaces:
s/([،;؛¿!\"\\])}»›”؟%٪°±©®।॥…])/ $1 /g;`
In
Python version of $1
is \1
, but you should use regex substitution instead of simple string replace:
import re
p = ur'([،;؛¿!"\])}»›”؟%٪°±©®।॥…])'
text = u"this, is a sentence with weird» symbols… appearing everywhere¿"
print re.sub(p, ur' \1 ', text)
Outputs:
this , is a sentence with weird » symbols … appearing everywhere ¿