Convert a 64 bit integer into 8 separate 1 byte integers in python

落花浮王杯 submitted on 2019-11-29 13:15:33
Mark Ransom

In Python 2.x, struct.pack returns a string of bytes. It's easy to convert that to an array of integers.

>>> import struct
>>> bytestr = struct.pack('>Q', 2592701575664680400)
>>> bytestr
'#\xfb X\xaa\x16\xbd\xd0'
>>> [ord(b) for b in bytestr]
[35, 251, 32, 88, 170, 22, 189, 208]

The struct module in Python converts between Python objects and byte strings, typically packed according to C structure packing rules. struct.pack takes a format specifier (a string which describes how the bytes of the structure should be laid out) and some Python data, and packs the data into a byte string. struct.unpack does the inverse: it takes a format specifier and a byte string and returns a tuple of the unpacked data as Python objects.

The format specifier being used has two parts. The leading character specifies the endianness (byte order) of the packed bytes. The following characters specify the types of the fields of the struct being packed or unpacked. So '>Q' means to pack the given data as a big-endian unsigned long long (8 bytes). To get the bytes in the opposite order, use '<' instead for little-endian.
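As a small sketch of the format string in action (the variable names are illustrative), the same value can be packed in both byte orders and unpacked again:

```python
import struct

n = 2592701575664680400  # example value from the answer above

big = struct.pack('>Q', n)     # big-endian unsigned long long (8 bytes)
little = struct.pack('<Q', n)  # little-endian: same bytes, reversed order

print(len(big))                # 8
print(big == little[::-1])     # True

# struct.unpack always returns a tuple, even for a single field
(value,) = struct.unpack('>Q', big)
print(value == n)              # True
```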

The final operation is a list comprehension which iterates over the characters of the byte string and uses the ord builtin function to get the integer representation of that character.
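In Python 3, struct.pack returns a bytes object, and iterating over bytes already yields integers, so ord is not needed (calling it on those integers raises TypeError). A minimal sketch, assuming Python 3; int.to_bytes is a built-in alternative that was not part of the original answer:

```python
import struct

n = 2592701575664680400

# Iterating over a bytes object in Python 3 yields ints directly.
byte_values = list(struct.pack('>Q', n))
print(byte_values)  # [35, 251, 32, 88, 170, 22, 189, 208]

# int.to_bytes (Python 3.2+) produces the same bytes without struct.
same_values = list(n.to_bytes(8, byteorder='big'))
print(same_values == byte_values)  # True
```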

Final note: Python doesn't actually have a concept of integer size. In 2.x, there is int, which is limited to the size of a C long (at least 32 bits), and long, which is of unlimited size. In 3.x those two were unified into a single type. So even though this operation is guaranteed to give integers that fit in one byte, nothing about Python will force the resulting integers to stay that way if you use them in other operations.
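To illustrate that last point, here is a short sketch: adding two of the byte values gives a result that no longer fits in a byte, and masking with 0xFF is one way to wrap it back if byte-sized arithmetic is what you want.

```python
byte_values = [35, 251, 32, 88, 170, 22, 189, 208]

total = byte_values[1] + byte_values[7]  # 251 + 208
print(total)    # 459, no longer fits in one byte

wrapped = total & 0xFF  # keep only the low 8 bits
print(wrapped)  # 203
```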

Solution

Solution without converting the number to a string:

x = 0b0010001111111011001000000101100010101010000101101011111000000000

numbers = list((x >> i) & 0xFF for i in range(0,64,8))
print(numbers)                    # [0, 190, 22, 170, 88, 32, 251, 35]
print(list(reversed(numbers)))    # [35, 251, 32, 88, 170, 22, 190, 0]

Explanation

Here I used a generator expression passed to list, looping over i in steps of 8. So i takes the values 0, 8, 16, 24, 32, 40, 48, 56. Each time, the bit-shift operator >> shifts the number x down by i bits (x itself is not modified). This is equivalent to floor-dividing by 2^i, that is, by 256^(i/8).

So the resulting number is:

i = 0:   0010001111111011001000000101100010101010000101101011111000000000
i = 8:           00100011111110110010000001011000101010100001011010111110
i = 16:                  001000111111101100100000010110001010101000010110
i = 24:                          0010001111111011001000000101100010101010
i = 32:                                  00100011111110110010000001011000
i = 40:                                          001000111111101100100000
i = 48:                                                  0010001111111011
i = 56:                                                          00100011
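The shifted values can be recomputed with a short loop. This is only a sketch for checking the arithmetic; it also confirms that shifting right by i bits matches floor division by 2**i.

```python
x = 0b0010001111111011001000000101100010101010000101101011111000000000

for i in range(0, 64, 8):
    shifted = x >> i
    # Right shift by i bits equals floor division by 2**i (256**(i // 8)).
    assert shifted == x // 2**i == x // 256**(i // 8)
    # Zero-pad to the 64 - i bits that remain, then right-align for display.
    bits = "{:0{w}b}".format(shifted, w=64 - i)
    print("i = {:<2}: {:>64}".format(i, bits))

top_byte = x >> 56
print(top_byte)  # 35
```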

By using & 0xFF, I select the last 8 bits of this number. Example, for i = 40:

x >> 40:           001000111111101100100000
0xff:                              11111111
(x >> 40) & 0xff:  000000000000000000100000

Since the leading zeros do not matter, you have the desired number.
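A quick check of the masking step, using a shift of 40 bits (the shift that produces the 24-bit value 001000111111101100100000):

```python
x = 0b0010001111111011001000000101100010101010000101101011111000000000

shifted = x >> 40
low_byte = shifted & 0xFF

print(bin(shifted))  # 0b1000111111101100100000 (bin drops the leading zeros)
print(low_byte)      # 32

# Masking with 0xFF gives the same result as the remainder modulo 256.
print(low_byte == shifted % 256)  # True
```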

The result is converted to a list and printed in normal and reversed order (like OP wanted it).

Performance

I compared the timing of this result to the other solutions proposed in this thread:

In: timeit list(reversed([(x >> i) & 0xFF for i in range(0,64,8)]))
100000 loops, best of 3: 13.9 µs per loop

In: timeit [(x >> (i * 8)) & 0xFF for i in range(7, -1, -1)]
100000 loops, best of 3: 11.1 µs per loop

In: timeit [(x >> i) & 0xFF for i in range(56,-1,-8)]
100000 loops, best of 3: 10.2 µs per loop

In: timeit reversed(struct.unpack('8B', struct.pack('Q', x)))
100000 loops, best of 3: 3.22 µs per loop

In: timeit reversed(struct.pack('Q', x))
100000 loops, best of 3: 2.07 µs per loop

Result: my solution is not the fastest! Currently, using struct directly (as proposed by Mark Ransom) seems to be the fastest snippet.
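Timings like these depend heavily on the machine and the Python version. Here is a sketch of how the comparison could be rerun with the standard timeit module. The candidate labels are illustrative, and each candidate builds a full list so that the work being timed is comparable (a bare reversed(...) call only creates a lazy iterator).

```python
import struct
import timeit

x = 0b0010001111111011001000000101100010101010000101101011111000000000

# Each candidate returns the 8 bytes of x, most significant first, as a list.
candidates = {
    "shift, then reverse": lambda: list(reversed([(x >> i) & 0xFF for i in range(0, 64, 8)])),
    "shift, descending": lambda: [(x >> i) & 0xFF for i in range(56, -1, -8)],
    "struct, big-endian": lambda: list(struct.unpack('8B', struct.pack('>Q', x))),
}

expected = [35, 251, 32, 88, 170, 22, 190, 0]
runs = 10000
for label, func in candidates.items():
    assert func() == expected  # make sure all candidates agree before timing
    seconds = timeit.timeit(func, number=runs)
    print("{}: {:.2f} us per call".format(label, seconds / runs * 1e6))
```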

bn = "0010001111111011001000000101100010101010000101101011111000000000"

print([int(bn[i:i+8], 2) for i in range(0,len(bn), 8)])
[35, 251, 32, 88, 170, 22, 190, 0]

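If you start from the integer n rather than from a bit string, take its binary representation with bin first. Slicing it in steps of 8 works here only because bin(n) happens to be exactly 64 characters long (the '0b' prefix plus 62 significant bits):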

n = 2592701575664680373
bn = bin(n)

print([int(bn[i:i+8], 2) for i in range(0,len(bn), 8)])
[35, 251, 32, 88, 170, 22, 189, 181]
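A more defensive sketch pads to exactly 64 binary digits with format, so the 8-character slices line up for any non-negative 64-bit value. The helper name bytes_from_bits is hypothetical.

```python
def bytes_from_bits(n):
    """Split a non-negative 64-bit integer into 8 byte values, high byte first."""
    bits = format(n, '064b')  # exactly 64 binary digits, zero-padded, no '0b'
    return [int(bits[i:i + 8], 2) for i in range(0, 64, 8)]

print(bytes_from_bits(2592701575664680373))
# [35, 251, 32, 88, 170, 22, 189, 181]
print(bytes_from_bits(258))
# [0, 0, 0, 0, 0, 0, 1, 2]
```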

Some timings:

In [16]: %%timeit                                                
numbers = list((n >> i) & 0xFF for i in range(0,64,8))
list(reversed(numbers))
   ....: 
100000 loops, best of 3: 2.97 µs per loop

In [17]: timeit [(n >> (i * 8)) & 0xFF for i in range(7, -1, -1)]
1000000 loops, best of 3: 1.73 µs per loop

In [18]: %%timeit                                                
bn = bin(n)
[int(bn[i:i+8], 2) for i in range(0,len(bn), 8)]
   ....: 
100000 loops, best of 3: 3.96 µs per loop

You can also just use divmod:

out = []
for _ in range(8):
    n, i = divmod(n, 256)
    out.append(i) 
out = out[::-1]
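Note that the loop overwrites n as it goes. A sketch that wraps the same divmod idea in a function (the name split_bytes is hypothetical), leaving the caller's value untouched:

```python
def split_bytes(n, count=8):
    """Return the low `count` bytes of n, most significant first."""
    out = []
    for _ in range(count):
        n, low = divmod(n, 256)  # quotient carries on, remainder is the low byte
        out.append(low)
    return out[::-1]

value = 2592701575664680400
print(split_bytes(value))  # [35, 251, 32, 88, 170, 22, 189, 208]
print(value)               # 2592701575664680400 (unchanged)
```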

Which is almost as efficient:

In [31]: %%timeit
   ....: n = 2592701575664680411
   ....: out = []
   ....: for _ in range(8):
   ....:     n, i = divmod(n, 1 << 8)
   ....:     out.append(i)
   ....: out[::-1]
   ....: 
100000 loops, best of 3: 2.35 µs per loop

There is very little advantage in bit shifting with Python. I would be more inclined to use whatever you and others find more readable.

Here's a version using struct:

import struct
n = 2592701575664680400
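byte_values = struct.unpack('8B', struct.pack('Q', n))  # avoid shadowing the built-in name bytes

With no byte-order prefix, 'Q' uses the machine's native byte order. On a little-endian machine, the bytes are therefore returned in the opposite order to the one you showed in your question; use '>Q' to get the most significant byte first.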
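A short sketch of how the byte-order prefix changes the result, with sys.byteorder used to check what the unprefixed 'Q' does on the current machine:

```python
import struct
import sys

n = 2592701575664680400

native = struct.unpack('8B', struct.pack('Q', n))   # native byte order
big = struct.unpack('8B', struct.pack('>Q', n))     # most significant byte first
little = struct.unpack('8B', struct.pack('<Q', n))  # least significant byte first

print(big)     # (35, 251, 32, 88, 170, 22, 189, 208)
print(little)  # (208, 189, 22, 170, 88, 32, 251, 35)

# Without a prefix, the result follows the machine's byte order.
print(native == (little if sys.byteorder == 'little' else big))  # True
```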

Here are the performance stats:

python -m timeit -s "import struct" "struct.unpack('8B', struct.pack('Q', 2592701575664680400))"
1000000 loops, best of 3: 0.33 usec per loop

On my computer, this is three times faster than the bit-shifting solution.
