问题
Is there a shortcut to Convert binary (0|1) numpy array to integer or binary-string ? F.e.
b = np.array([0,0,0,0,0,1,0,1])
=> b is 5
np.packbits(b)
works but only for 8 bit values ..if the numpy is 9 or more elements it generates 2 or more 8bit values. Another option would be to return a string of 0|1 ...
What I currently do is :
ba = bitarray()
ba.pack(b.astype(np.bool).tostring())
#convert from bitarray 0|1 to integer
result = int( ba.to01(), 2 )
which is ugly !!!
回答1:
One way would be using dot-product with 2-powered
range array -
b.dot(2**np.arange(b.size)[::-1])
Sample run -
In [95]: b = np.array([1,0,1,0,0,0,0,0,1,0,1])
In [96]: b.dot(2**np.arange(b.size)[::-1])
Out[96]: 1285
Alternatively, we could use bitwise left-shift operator to create the range array and thus get the desired output, like so -
b.dot(1 << np.arange(b.size)[::-1])
If timings are of interest -
In [148]: b = np.random.randint(0,2,(50))
In [149]: %timeit b.dot(2**np.arange(b.size)[::-1])
100000 loops, best of 3: 13.1 µs per loop
In [150]: %timeit b.dot(1 << np.arange(b.size)[::-1])
100000 loops, best of 3: 7.92 µs per loop
Reverse process
To retrieve back the binary array, use np.binary_repr alongwith np.fromstring -
In [96]: b = np.array([1,0,1,0,0,0,0,0,1,0,1])
In [97]: num = b.dot(2**np.arange(b.size)[::-1]) # integer
In [98]: np.fromstring(np.binary_repr(num), dtype='S1').astype(int)
Out[98]: array([1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1])
回答2:
Using numpy for conversion limits you to 64-bit signed binary results. If you really want to use numpy and the 64-bit limit works for you a faster implementation using numpy is:
import numpy as np
def bin2int(bits):
return np.right_shift(np.packbits(bits, -1), bits.size).squeeze()
Since normally if you are using numpy you care about speed then the fastest implementation for > 64-bit results is:
import gmpy2
def bin2int(bits):
return gmpy2.pack(list(bits[::-1]), 1)
If you don't want to grab a dependency on gmpy2 this is a little slower but has no dependencies and supports > 64-bit results:
def bin2int(bits):
total = 0
for shift, j in enumerate(bits[::-1]):
if j:
total += 1 << shift
return total
The observant will note some similarities in the last version to other Answers to this question with the main difference being the use of the << operator instead of **, in my testing this led to a significant improvement in speed.
回答3:
I extended the good dot product solution of @Divikar to run ~180x faster on my host, by using vectorized matrix multiplication code. The original code that runs one-row-at-a-time took ~3 minutes to run 100K rows of 18 columns in my pandas dataframe. Well, next week I need to upgrade from 100K rows to 20M rows, so ~10 hours of running time was not going to be fast enough for me. The new code is vectorized, first of all. That's the real change in the python code. Secondly, matmult often runs in parallel without you seeing it, on many-core processors depending on your host configuration, especially when OpenBLAS or other BLAS is present for numpy to use on matrix algebra like this matmult. So it can use a lot of processors and cores, if you have it.
The new -- quite simple -- code runs 100K rows x 18 binary columns in ~1 sec ET on my host which is "mission accomplished" for me:
'''
Fast way is vectorized matmult. Pass in all rows and cols in one shot.
'''
def BitsToIntAFast(bits):
m,n = bits.shape # number of columns is needed, not bits.size
a = 2**np.arange(n)[::-1] # -1 reverses array of powers of 2 of same length as bits
return bits @ a # this matmult is the key line of code
'''I use it like this:'''
bits = d.iloc[:,4:(4+18)] # read bits from my pandas dataframe
gs = BitsToIntAFast(bits)
print(gs[:5])
gs.shape
...
d['genre'] = np.array(gs) # add the newly computed column to pandas
Hope this helps.
回答4:
def binary_converter(arr):
total = 0
for index, val in enumerate(reversed(arr)):
total += (val * 2**index)
print total
In [14]: b = np.array([1,0,1,0,0,0,0,0,1,0,1])
In [15]: binary_converter(b)
1285
In [9]: b = np.array([0,0,0,0,0,1,0,1])
In [10]: binary_converter(b)
5
or
b = np.array([1,0,1,0,0,0,0,0,1,0,1])
sum(val * 2**index for index, val in enumerate(reversed(b)))
来源:https://stackoverflow.com/questions/41069825/convert-binary-01-numpy-to-integer-or-binary-string