Question
I have a string in hex:
Hex = 'E388854083969497A4A38599408881A2409985829696A38584408699969440814082A48783888583924B'
As a bytes object it looks like this:
b'\xe3\x88\x85@\x83\x96\x94\x97\xa4\xa3\x85\x99@\x88\x81\xa2@\x99\x85\x82\x96\x96\xa3\x85\x84@\x86\x99\x96\x94@\x81@\x82\xa4\x87\x83\x88\x85\x83\x92K'
In EBCDIC it is this:
The computer has rebooted from a bugcheck.
So I know that hex 40 (x40) is a space in EBCDIC and an '@' in ASCII.
I can't figure out why Python, when printing the bytes objects, prints '@' instead of '\x40'.
My test code sample is:
import codecs
Hex = 'E388854083969497A4A38599408881A2409985829696A38584408699969440814082A48783888583924B'
output = []
DDF = [4,9,4,9,5,2,9]
distance = 0
# This breaks my hex string into chunks based on the list 'DDF'
# (each entry is a length in bytes, i.e. two hex digits per byte)
for x in DDF:
    output.append(Hex[distance:x*2+distance])
    distance += x*2
# This prints out the list of hex strings
for x in output:
    print(x)
# This prints out the bytes objects in the list
for x in output:
    x = codecs.decode(x, "hex")
    print(x)
# The next lines print the correct text
Hex = codecs.decode(Hex, "hex")
print(codecs.decode(Hex, 'cp1140'))
The output of the above is:
E3888540
83969497A4A3859940
8881A240
9985829696A3858440
8699969440
8140
82A48783888583924B
b'\xe3\x88\x85@'
b'\x83\x96\x94\x97\xa4\xa3\x85\x99@'
b'\x88\x81\xa2@'
b'\x99\x85\x82\x96\x96\xa3\x85\x84@'
b'\x86\x99\x96\x94@'
b'\x81@'
b'\x82\xa4\x87\x83\x88\x85\x83\x92K'
The computer has rebooted from a bugcheck.
So I guess my question is: how can I get Python to print the bytes object with '\x40' instead of '@'?
Thank you so much for your help :)
Answer 1:
I think your byte array is slightly off.
According to this, you need to use 'cp500' for decoding. For example:
my_string_in_hex = 'E388854083969497A4A38599408881A2409985829696A38584408699969440814082A48783888583924B'
my_bytes = bytearray.fromhex(my_string_in_hex)
print(my_bytes)
my_string = my_bytes.decode('cp500')
print(my_string)
output:
bytearray(b'\xe3\x88\x85@\x83\x96\x94\x97\xa4\xa3\x85\x99@\x88\x81\xa2@\x99\x85\x82\x96\x96\xa3\x85\x84@\x86\x99\x96\x94@\x81@\x82\xa4\x87\x83\x88\x85\x83\x92K')
The computer has rebooted from a bugcheck.
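As a side check, here is a minimal sketch comparing the two code pages mentioned in this thread. The variable names are illustrative only. For this particular input, which contains only letters, spaces and a period, 'cp500' and the question's 'cp1140' appear to give the same text; the two code pages do differ for some punctuation characters, so this should not be read as a general equivalence.

```python
hex_string = 'E388854083969497A4A38599408881A2409985829696A38584408699969440814082A48783888583924B'
raw = bytes.fromhex(hex_string)

# Decode the same bytes with both EBCDIC code pages from the thread.
text_cp500 = raw.decode('cp500')
text_cp1140 = raw.decode('cp1140')

print(text_cp500)
print(text_cp500 == text_cp1140)
```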
When you print the bytearray, it will still print a '@'; however, it is actually \x40 "under the covers". This is just the __repr__() of the object. Because this method takes no "decode" parameter, it cannot apply a code page; it just creates a "readable" string for printing purposes.
__repr__() (or repr()) is "just that": it is only a representation of the object, not the actual object. This does not mean the byte is actually the text character '@'; repr() just uses that character when printing. It is still a bytearray, not a string.
When decoding, it will decode properly, using the selected code page.
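To make the point about repr() concrete, here is a small sketch. The helper escape_all is a hypothetical name introduced for illustration, not a standard-library function. It shows that the stored byte is 0x40 either way, and one possible way to render every byte as a \xNN escape if that is the display you want.

```python
data = bytes.fromhex('E3888540')

# The last byte is 0x40; repr() merely chooses to display it as '@'.
assert data[-1] == 0x40
assert data == b'\xe3\x88\x85\x40'

def escape_all(raw):
    """Render every byte as a \\xNN escape, printable or not (illustrative helper)."""
    return ''.join('\\x{:02x}'.format(b) for b in raw)

print(repr(data))        # b'\xe3\x88\x85@'
print(escape_all(data))  # \xe3\x88\x85\x40
```

The result of escape_all is an ordinary str meant for display; the underlying bytes object is unchanged.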
Answer 2:
When you print() a bytes object, Python shows every byte that corresponds to a printable ASCII character as that character, and only the remaining bytes as \xNN escapes. If you need the full hex string printed, use binascii.hexlify():
import binascii
import codecs
Hex = 'E388854083969497A4A38599408881A2409985829696A38584408699969440814082A48783888583924B'
print(binascii.hexlify(codecs.decode(Hex, 'hex')))
# b'e388854083969497a4a38599408881a2409985829696a38584408699969440814082a48783888583924b'
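As an alternative sketch, assuming Python 3.5 or later, bytes.fromhex() and bytes.hex() can do the same job without codecs or binascii. The example below reuses the field lengths from the question's DDF list; the names field_lengths and fields are illustrative, and 'cp500' is used only because Answer 1 suggests it.

```python
hex_string = 'E388854083969497A4A38599408881A2409985829696A38584408699969440814082A48783888583924B'
field_lengths = [4, 9, 4, 9, 5, 2, 9]  # byte lengths, taken from the question's DDF list

raw = bytes.fromhex(hex_string)

# Slice the raw bytes (not the hex text) into fields.
fields = []
offset = 0
for length in field_lengths:
    fields.append(raw[offset:offset + length])
    offset += length

# Show each field as unambiguous hex next to its EBCDIC decoding.
for field in fields:
    print(field.hex().upper(), repr(field.decode('cp500')))
```

Slicing the bytes directly avoids the "times two" arithmetic needed when slicing the hex string.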
Source: https://stackoverflow.com/questions/49012503/python-byte-representation-of-a-hex-string-that-is-ebcdic