In Perl, I can do the following, which will pad my punctuation symbols with spaces:
s/([،;؛¿!\"\])}»›”؟%٪°±©®।॥…])/ $1 /g;
In Python, the equivalent of $1 is \1, but you should use regex substitution instead of a simple string replace:
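import re

# Python 3: str is already Unicode, so the Python 2 ur'' prefix becomes r''.
p = r'([،;؛¿!"\])}»›”؟%٪°±©®।॥…])'
text = "this, is a sentence with weird» symbols… appearing everywhere¿"
# \1 refers to the first capture group, like $1 in Perl.
print(re.sub(p, r' \1 ', text))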
Outputs:
this, is a sentence with weird » symbols … appearing everywhere ¿
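Note that padding a symbol that already has a space beside it produces two consecutive spaces, and a symbol at the end of the string leaves a trailing space. A minimal Python 3 sketch of one way to tidy that up, assuming single spaces are what you want; pad_punctuation is a hypothetical helper name, not something from the answer above:

```python
import re

# Same character class as above, compiled once for reuse.
PUNCT = re.compile(r'([،;؛¿!"\])}»›”؟%٪°±©®।॥…])')

def pad_punctuation(text):
    """Pad each listed symbol with spaces, then collapse runs of spaces."""
    padded = PUNCT.sub(r' \1 ', text)
    # Padding next to an existing space yields doubled spaces; squeeze them
    # and trim the ends.
    return re.sub(r' {2,}', ' ', padded).strip()

text = "this, is a sentence with weird» symbols… appearing everywhere¿"
print(pad_punctuation(text))
```

The ASCII comma is left alone because it is not in the character class; add it to the class if it should be padded too.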
Use the format function and insert a Unicode string:
p = u'،;؛¿!"\])}»›”؟%٪°±©®।॥…'
text = u"this, is a sentence with weird» symbols… appearing everywhere¿"
for i in p:
    text = text.replace(i, u' {} '.format(i))
print(text)
Output
this, is a sentence with weird » symbols … appearing everywhere ¿
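The loop above rescans the whole string once per symbol. As a possible alternative (a swapped-in technique, not what this answer uses), Python 3's str.translate can do the same padding in a single pass. This sketch assumes Python 3; the names symbols and table are illustrative, and the backslash from p is dropped on the assumption that it was only there to escape ] for a regex:

```python
# Same symbols as p above, minus the backslash (assumed to be a regex escape only).
symbols = '،;؛¿!"])}»›”؟%٪°±©®।॥…'

# Build a translation table mapping each symbol to itself padded with spaces.
table = str.maketrans({ch: ' {} '.format(ch) for ch in symbols})

text = "this, is a sentence with weird» symbols… appearing everywhere¿"
print(text.translate(table))
```

Like the loop, this leaves doubled spaces wherever a symbol was already next to a space.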
You can use re.sub, with \1 as a placeholder.
>>> import re
>>> # Python 3: a raw string keeps \] intact for the regex engine.
>>> p = r'،;؛¿!"\])}»›”؟%٪°±©®।॥…'
>>> text = "this, is a sentence with weird» symbols… appearing everywhere¿"
>>> text = re.sub('([{}])'.format(p), r' \1 ', text)
>>> print(text)
this, is a sentence with weird » symbols … appearing everywhere ¿
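Hand-escaping ] inside the symbol list is easy to get wrong. One way to avoid it, sketched here under the assumption of Python 3, is to let re.escape do the escaping before the list is dropped into the character class; the names symbols and pattern are illustrative only:

```python
import re

# Plain list of symbols, with no manual escaping.
symbols = '،;؛¿!"])}»›”؟%٪°±©®।॥…'

# re.escape backslash-escapes characters that are special in a pattern,
# such as ']', so the list is safe inside [...].
pattern = re.compile('([{}])'.format(re.escape(symbols)))

text = "this, is a sentence with weird» symbols… appearing everywhere¿"
print(pattern.sub(r' \1 ', text))
```

This keeps the symbol list readable and makes it safer to add further characters later.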