Process escape sequences in a string in Python

隐瞒了意图╮ 2020-11-22 03:56

Sometimes when I get input from a file or the user, I get a string with escape sequences in it. I would like to process the escape sequences in the same way that Python proc

6 answers

太阳男子 (original poster)

2020-11-22 04:51
A correct and convenient answer for Python 3:
```
>>> import codecs
>>> myString = "spam\\neggs"
>>> print(codecs.escape_decode(bytes(myString, "utf-8"))[0].decode("utf-8"))
spam
eggs
>>> myString = "naïve \\t test"
>>> print(codecs.escape_decode(bytes(myString, "utf-8"))[0].decode("utf-8"))
naïve    test
```
Details regarding codecs.escape_decode:
- codecs.escape_decode is a bytes-to-bytes decoder.
- codecs.escape_decode decodes ASCII escape sequences, such as: b"\\n" -> b"\n", b"\\xce" -> b"\xce".
- codecs.escape_decode does not care or need to know about the bytes object's encoding, but the encoding of the escaped bytes should match the encoding of the rest of the object.
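The points above can be wrapped in a small helper. This is only a sketch: the function name `unescape` and its `encoding` parameter are made up for illustration, and it relies on codecs.escape_decode, which is undocumented and could change between Python versions.

```python
import codecs


def unescape(text, encoding="utf-8"):
    """Process backslash escapes in text (hypothetical helper, not a stdlib API)."""
    # Encode first, because codecs.escape_decode works on bytes, not str.
    raw = text.encode(encoding)
    # escape_decode returns a tuple: (decoded bytes, number of input bytes consumed).
    decoded, _consumed = codecs.escape_decode(raw)
    # Decode with the same codec, so escaped bytes and literal bytes are
    # interpreted consistently.
    return decoded.decode(encoding)


print(unescape("spam\\neggs"))
print(unescape("naïve \\t test"))
# Escaped bytes must be valid in the chosen encoding: \xc3\xa9 is "é" in UTF-8.
print(unescape("caf\\xc3\\xa9"))
```

Because the escaped bytes are decoded together with the rest of the string, an escape such as `\\xe9` (a Latin-1 "é") would raise UnicodeDecodeError under the default "utf-8" here; in that case pass the matching encoding instead.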
Background:
- @rspeer is correct: unicode_escape is the incorrect solution for Python 3. This is because unicode_escape decodes escaped bytes, then decodes bytes to a unicode string, but receives no information regarding which codec to use for the second operation. In practice it treats the remaining bytes as Latin-1, so non-ASCII text encoded as UTF-8 comes out garbled.
- @Jerub is correct: avoid the AST or eval.
- I first discovered codecs.escape_decode from this answer to "how do I .decode('string-escape') in Python3?". As that answer states, that function is currently not documented for python 3.