I have quite a large amount of text which includes control characters like \n, \t and \r. I need to replace them with a simple space (" "). What is the fastest way to do this?
I think the fastest way is to use str.translate():
import string
s = "a\nb\rc\td"
# string.maketrans() needs two arguments of equal length: three characters map to three spaces
print s.translate(string.maketrans("\n\t\r", "   "))
prints
a b c d
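For readers on Python 3, where string.maketrans() no longer exists and print is a function, the same idea can be sketched as follows. This is only a minimal sketch assuming Python 3; the names table and result are illustrative and not part of the original answer.

```python
# Python 3 sketch: str.maketrans() takes over the role of string.maketrans()
s = "a\nb\rc\td"

# Both arguments must have the same length, so each of the three
# control characters is mapped to its own space.
table = str.maketrans("\n\t\r", " " * 3)

result = s.translate(table)
print(result)  # a b c d
```

Building the table once and reusing it for every string avoids repeating the setup work when many strings have to be cleaned.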
EDIT: As this once again turned into a discussion about performance, here are some numbers. For long strings, translate() is way faster than using regular expressions:
import re
import string
s = "a\nb\rc\td " * 1250000
regex = re.compile(r'[\n\r\t]')
%timeit t = regex.sub(" ", s)
# 1 loops, best of 3: 1.19 s per loop
# maketrans() needs equal-length arguments, hence the three spaces
table = string.maketrans("\n\t\r", "   ")
%timeit s.translate(table)
# 10 loops, best of 3: 29.3 ms per loop
That's about a factor of 40.
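If you want to repeat the comparison outside IPython, something like the following might do. It is a rough sketch assuming Python 3 and the standard timeit module; the shorter test string and the names t_regex and t_translate are my own choices, and the timings will of course differ from the numbers above depending on machine and Python version.

```python
import re
import timeit

# Shorter than the string used above, just to keep the run quick
s = "a\nb\rc\td " * 100000

regex = re.compile(r"[\n\r\t]")
table = str.maketrans("\n\t\r", " " * 3)

# Sanity check: both approaches should give the same result
assert regex.sub(" ", s) == s.translate(table)

# Best of three single runs for each approach
t_regex = min(timeit.repeat(lambda: regex.sub(" ", s), number=1, repeat=3))
t_translate = min(timeit.repeat(lambda: s.translate(table), number=1, repeat=3))

print("regex:     %.4f s" % t_regex)
print("translate: %.4f s" % t_translate)
```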