Question
I need to load the third column of this text file as a hex string:
http://www.netmite.com/android/mydroid/1.6/external/skia/emoji/gmojiraw.txt
>>> open('gmojiraw.txt').read().split('\n')[0].split('\t')[2]
'\\xF3\\xBE\\x80\\x80'
How do I open the file so that I can get the third column as a hex string:
'\xF3\xBE\x80\x80'
I also tried binary mode and hex mode, with no success.
Answer 1:
You can:
- Remove the \x-es
- Use .decode('hex') on the resulting string
Code:
>>> '\\xF3\\xBE\\x80\\x80'.replace('\\x', '').decode('hex')
'\xf3\xbe\x80\x80'
Note the appropriate interpretation of backslashes. When the string representation is '\xf3', it means a single-byte string with the byte value 0xF3. When it's '\\xf3', which is your input, it means a string consisting of 4 characters: \, x, f and 3.
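The .decode('hex') call above is Python 2 only. As a rough sketch of the same two steps on Python 3, assuming the column holds nothing but \xNN escapes, bytes.fromhex can take its place. The helper name hex_escapes_to_bytes is made up for illustration:

```python
def hex_escapes_to_bytes(text):
    # Hypothetical helper: strip every literal backslash-x pair,
    # leaving only the hex digits (e.g. "F3BE8080").
    hex_digits = text.replace("\\x", "")
    # bytes.fromhex plays the role of Python 2's .decode('hex').
    return bytes.fromhex(hex_digits)


raw = "\\xF3\\xBE\\x80\\x80"  # the 16-character text read from the file
result = hex_escapes_to_bytes(raw)
print(len(raw), len(result))  # 16 4
print(result)                 # b'\xf3\xbe\x80\x80'
```

If the column ever contained anything besides \xNN escapes, bytes.fromhex would raise ValueError, which is arguably a useful check.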
Answer 2:
Quick'n'dirty reply
your_string.decode('string_escape')
>>> a='\\xF3\\xBE\\x80\\x80'
>>> a.decode('string_escape')
'\xf3\xbe\x80\x80'
>>> len(_)
4
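The string_escape codec exists only in Python 2. One possible Python 3 approximation, assuming the text is pure ASCII and uses only \xNN escapes, is to decode with unicode_escape and then map the resulting code points back to bytes with latin-1. The function name unescape_to_bytes is an invention for this sketch:

```python
def unescape_to_bytes(text):
    # unicode_escape interprets each \xNN as the code point U+00NN.
    decoded = text.encode("ascii").decode("unicode_escape")
    # latin-1 maps code points 0-255 one-to-one onto single bytes.
    return decoded.encode("latin-1")


a = "\\xF3\\xBE\\x80\\x80"
b = unescape_to_bytes(a)
print(b)       # b'\xf3\xbe\x80\x80'
print(len(b))  # 4
```

An escape such as \u20AC would decode to a code point above 255 and make the latin-1 step fail, so this is only a stand-in for byte-oriented input like the one in the question.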
Bonus info
>>> u='\uDBB8\uDC03'
>>> u.decode('unicode_escape')
Some trivia
What's interesting is that I have Python 2.6.4 on Karmic Koala Ubuntu (sys.maxunicode==1114111) and Python 2.6.5 on Gentoo (sys.maxunicode==65535); on Ubuntu, the unicode_escape-decode result is \uDBB8\uDC03 and on Gentoo it's u'\U000fe003', both correctly of length 2. Unless it's something fixed between 2.6.4 and 2.6.5, I'm impressed the 2-byte-per-unicode-character Gentoo version reports the correct character.
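The two results quoted above appear to be the same character written two ways: \uDBB8\uDC03 is a UTF-16 surrogate pair and \U000fe003 is the code point it encodes. A small sketch of the standard surrogate arithmetic (written for Python 3, with combine_surrogates as an illustrative name) shows how one maps to the other:

```python
def combine_surrogates(high, low):
    # UTF-16 rule: each surrogate carries 10 bits of (code point - 0x10000).
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


code_point = combine_surrogates(0xDBB8, 0xDC03)
print(hex(code_point))  # 0xfe003

# Going the other way: encoding the code point as UTF-16 yields the pair.
print(chr(code_point).encode("utf-16-be"))  # b'\xdb\xb8\xdc\x03'
```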
Answer 3:
If you are using Python 2.6+, here is a safe way to use eval:
>>> from ast import literal_eval
>>> item='\\xF3\\xBE\\x80\\x80'
>>> literal_eval("'%s'"%item)
'\xf3\xbe\x80\x80'
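On Python 3 a plain quoted literal evaluates to a str of four code points rather than four bytes. A hedged variant of the same idea wraps the text in a bytes literal instead, assuming the column never contains a quote character:

```python
from ast import literal_eval

item = "\\xF3\\xBE\\x80\\x80"
# The b prefix makes literal_eval parse the text as a bytes literal.
value = literal_eval("b'%s'" % item)
print(value)       # b'\xf3\xbe\x80\x80'
print(len(value))  # 4
```

literal_eval only accepts Python literals, so malformed input raises an exception instead of running arbitrary code.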
Answer 4:
If you trust the source, you can use eval('"%s"' % data)
Answer 5:
After stripping out the "\x" as in Eli's answer, you can just do:
int("F3BE8080",16)
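Note that int(...) produces a number, not a byte string. If bytes are what is needed, one way to get there on Python 3 is int.to_bytes. This is a sketch that assumes two hex digits per byte and big-endian order, so that the bytes come out in the order they were written:

```python
hex_digits = "\\xF3\\xBE\\x80\\x80".replace("\\x", "")  # "F3BE8080"
number = int(hex_digits, 16)

# Two hex digits per byte; "big" keeps the most significant byte first.
as_bytes = number.to_bytes(len(hex_digits) // 2, "big")
print(as_bytes)  # b'\xf3\xbe\x80\x80'
```

Taking the length from the digit string rather than from the number matters here, because leading zero bytes would otherwise be lost.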
Source: https://stackoverflow.com/questions/3519125/converting-a-hex-string-representation-to-actual-bytes-in-python