Sometimes when I get input from a file or the user, I get a string with escape sequences in it. I would like to process the escape sequences in the same way that Python proc
The actually correct and convenient answer for python 3:
>>> import codecs
>>> myString = "spam\\neggs"
>>> print(codecs.escape_decode(bytes(myString, "utf-8"))[0].decode("utf-8"))
spam
eggs
>>> myString = "naïve \\t test"
>>> print(codecs.escape_decode(bytes(myString, "utf-8"))[0].decode("utf-8"))
naïve test
Details regarding codecs.escape_decode
:
codecs.escape_decode
is a bytes-to-bytes decodercodecs.escape_decode
decodes ascii escape sequences, such as: b"\\n"
-> b"\n"
, b"\\xce"
-> b"\xce"
.codecs.escape_decode
does not care or need to know about the byte object's encoding, but the encoding of the escaped bytes should match the encoding of the rest of the object.Background:
unicode_escape
is the incorrect solution for python3. This is because unicode_escape
decodes escaped bytes, then decodes bytes to unicode string, but receives no information regarding which codec to use for the second operation.codecs.escape_decode
from this answer to "how do I .decode('string-escape') in Python3?". As that answer states, that function is currently not documented for python 3.