问题
In python, I have been given a 64 bit integer. This Integer was created by taking several different 8 bit integers and mashing them together into one giant 64 bit integer. It is my job to separate them again.
For example:
Source number: 2592701575664680400
Binary (64 bits): 0010001111111011001000000101100010101010000101101011111000000000
int 1: 00100011 (35)
int 2: 11111011 (251)
int 3: 00100000 (32)
int 4: 01011000 (88)
int 5: 10101010 (170)
int 6: 00010110 (22)
int 7: 10111110 (190)
int 8: 00000000 (0)
So what I would like to do is take my source number 2592701575664680373
and return an array of length 8, where each int in the array are the ints listed above.
I was going to use struct
, but to be perfectly honest, reading the documentation hasn't made it quite clear exactly how I would accomplish that.
回答1:
In Python 2.x, struct.pack
returns a string of bytes. It's easy to convert that to an array of integers.
>>> bytestr = struct.pack('>Q', 2592701575664680400)
>>> bytestr
'#\xfb X\xaa\x16\xbd\xd0'
>>> [ord(b) for b in bytestr]
[35, 251, 32, 88, 170, 22, 189, 208]
The struct
module in python is used for converting from python object to byte strings, typically packed according to C structure packing rules. struct.pack
takes a format specifier (a string which describes how the bytes of the structure should be laid out), and some python data, and packs it into a byte string. struct.unpack
does the inverse, taking a format specifier and a byte string and returning a tuple of unpacked data once again in the format of python objects.
The format specifier being used has two parts. The lead character specifies the endianness (byte order) of the string. The following characters specify the types of the fields of the struct being packed or unpacked. So '>Q'
means to pack the given data as a big-endian unsigned long long
. To get the bytes in the opposite order, you could use <
instead for little-endian.
The final operation is a list comprehension which iterates over the characters of the byte string and uses the ord
builtin function to get the integer representation of that character.
Final note: Python doesn't actually have a concept of integer size. In 2.x, there is int
which is limited to 32 bits, and long
which is of unlimited size. In 3.x those two were unified into a single type. So even though this operation guarantees to give integers that take up only one byte, noting about python will force the resulting integers to stay that way if you use them in other operations.
回答2:
Solution
Solution without converting the number to a string:
x = 0b0010001111111011001000000101100010101010000101101011111000000000
numbers = list((x >> i) & 0xFF for i in range(0,64,8))
print(numbers) # [0, 190, 22, 170, 88, 32, 251, 35]
print(list(reversed(numbers))) # [35, 251, 32, 88, 170, 22, 190, 0]
Explanation
Here I used list comprehensions, making a loop in increments of 8 over i
. So i
takes the values 0, 8, 16, 24, 32, 40, 48, 56
.
Every time, the bitshift operator >>
temporarily shifts the number x
down by i
bits. This is equivalent to dividing by 256^i
.
So the resulting number is:
i = 0: 0010001111111011001000000101100010101010000101101011111000000000
i = 8: 00100011111110110010000001011000101010100001011010111110
i = 16: 001000111111101100100000010110001010101000010110
i = 24: 0010001111111011001000000101100010101010
i = 32: 00100011111110110010000001011000
i = 40: 001000111111101100100000
i = 48: 0010001111111011
i = 56: 00100011
By usig & 0xFF
, I select the last 8 bits of this number. Example:
x >> 48: 001000111111101100100000
0xff: 11111111
(x >> 48) & 0xff: 000000000000000000100000
Since the leading zeros do not matter, you have the desired number.
The result is converted to a list and printed in normal and reversed order (like OP wanted it).
Performance
I compared the timing of this result to the other solutions proposed in this thread:
In: timeit list(reversed([(x >> i) & 0xFF for i in range(0,64,8)]))
100000 loops, best of 3: 13.9 µs per loop
In: timeit [(x >> (i * 8)) & 0xFF for i in range(7, -1, -1)]
100000 loops, best of 3: 11.1 µs per loop
In: timeit [(x >> i) & 0xFF for i in range(63,-1,-8)]
100000 loops, best of 3: 10.2 µs per loop
In: timeit reversed(struct.unpack('8B', struct.pack('Q', x)))
100000 loops, best of 3: 3.22 µs per loop
In: timeit reversed(struct.pack('Q', x))
100000 loops, best of 3: 2.07 µs per loop
Result: my solution is not the fastest!
Currently, using struct
directly (as proposed by Mark Ransom) seems to be the fastest snippet.
回答3:
bn = "0010001111111011001000000101100010101010000101101011111000000000"
print([int(bn[i:i+8], 2) for i in range(0,len(bn), 8)])
[35, 251, 32, 88, 170, 22, 190, 0]
If you are using the binary representation of n then the output would be different:
n = 2592701575664680373
bn = bin(n)
print([int(bn[i:i+8], 2) for i in range(0,len(bn), 8)])
[35, 251, 32, 88, 170, 22, 189, 181]
Some timings:
In [16]: %%timeit
numbers = list((n >> i) & 0xFF for i in range(0,64,8))
list(reversed(numbers))
....:
100000 loops, best of 3: 2.97 µs per loop
In [17]: timeit [(n >> (i * 8)) & 0xFF for i in range(7, -1, -1)]
1000000 loops, best of 3: 1.73 µs per loop
In [18]: %%timeit
bn = bin(n)
[int(bn[i:i+8], 2) for i in range(0,len(bn), 8)]
....:
100000 loops, best of 3: 3.96 µs per loop
You can also just divmod:
out = []
for _ in range(8):
n, i = divmod(n, 256)
out.append(i)
out = out[::-1]
Which is almost as efficient:
In [31]: %%timeit
....: n = 2592701575664680411
....: out = []
....: for _ in range(8):
....: n, i = divmod(n, 1 << 8)
....: out.append(i)
....: out[::-1]
....:
100000 loops, best of 3: 2.35 µs per loop
There is very little advantage in bit shifting with python, I would be more inclined to use whatever you and others find more readable.
回答4:
Here's a version using struct
:
import struct
n = 2592701575664680400
bytes = struct.unpack('8B', struct.pack('Q', n))
The bytes
are returned in the opposite order that you showed in your question.
Here are the performance stats:
python -m timeit -s "import struct" "struct.unpack('8B', struct.pack('Q', 2592701575664680400))"
1000000 loops, best of 3: 0.33 usec per loop
On my computer, this is three times faster than the byte-shifting solution.
来源:https://stackoverflow.com/questions/32490081/convert-a-64-bit-integer-into-8-separate-1-byte-integers-in-python