fastest way to convert bitstring numpy array to integer base 2

Posted by 你。 on 2019-12-04 11:36:30
Divakar

One could use np.fromstring to split each string into its individual bits as uint8 values, then use matrix multiplication to reduce each row of bits to its decimal value. The raw bytes are the ASCII codes of '0' and '1' (48 and 49), so subtracting 48 yields the bits themselves. Thus, with A as the input array, one approach would be like so -

# Convert each bit of input string to numerals
str2num = (np.fromstring(A, dtype=np.uint8)-48).reshape(-1,4)

# Setup conversion array for binary number to decimal equivalent
de2bi_convarr = 2**np.arange(3,-1,-1)

# Use matrix multiplication for reducing each row of str2num to a single decimal
out = str2num.dot(de2bi_convarr)

Sample run -

In [113]: A    # Modified to show more variety
Out[113]: 
array([['0001'],
       ['1001'],
       ['1100'],
       ['0010']], 
      dtype='|S4')

In [114]: str2num = (np.fromstring(A, dtype=np.uint8)-48).reshape(-1,4)

In [115]: str2num
Out[115]: 
array([[0, 0, 0, 1],
       [1, 0, 0, 1],
       [1, 1, 0, 0],
       [0, 0, 1, 0]], dtype=uint8)

In [116]: de2bi_convarr = 2**np.arange(3,-1,-1)

In [117]: de2bi_convarr
Out[117]: array([8, 4, 2, 1])

In [118]: out = str2num.dot(de2bi_convarr)

In [119]: out
Out[119]: array([ 1,  9, 12,  2])
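Binary-mode np.fromstring has been deprecated in newer NumPy releases, with np.frombuffer named as its replacement. The sketch below is one possible adaptation of the approach above, not part of the original answer; going through A.tobytes() and the name codes are choices made here for illustration.

```python
import numpy as np

# Same sample input as above: one 4-character bitstring per row, stored as bytes.
A = np.array([['0001'], ['1001'], ['1100'], ['0010']], dtype='S4')

# Reinterpret the raw bytes as ASCII codes; b'0' is 48 and b'1' is 49.
# (Assumption: copying via tobytes() is acceptable; it keeps the call simple.)
codes = np.frombuffer(A.tobytes(), dtype=np.uint8)

# Subtract 48 to get 0/1 values, one row of 4 bits per input string.
str2num = (codes - 48).reshape(-1, 4)

# Weights 8, 4, 2, 1 turn each row of bits into its integer value.
de2bi_convarr = 2 ** np.arange(3, -1, -1)
out = str2num.dot(de2bi_convarr)
print(out)
```

The intermediate and final arrays should match str2num and out from the sample run above.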

An alternative avoids np.fromstring altogether. Here we convert the strings to an int array at the start, so that '1001' becomes the decimal number 1001, and then separate out each decimal digit, which gives the equivalent of str2num from the previous method. The rest of the code stays the same. Thus, an alternative implementation would be -

# Convert to int array and thus convert each bit of input string to numerals
str2num = np.remainder(A.astype(int)//(10**np.arange(3,-1,-1)),10)

de2bi_convarr = 2**np.arange(3,-1,-1)
out = str2num.dot(de2bi_convarr)
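To make the digit-splitting step concrete, here is a small worked sketch on the same sample input. It uses the builtin int as the dtype, and the names as_decimal and powers_of_ten are introduced here only for illustration.

```python
import numpy as np

A = np.array([['0001'], ['1001'], ['1100'], ['0010']], dtype='S4')

# Parse each string as a base-10 integer: b'1001' becomes 1001.
as_decimal = A.astype(int)

# Integer-divide by 1000, 100, 10, 1 and keep the last digit of each quotient.
powers_of_ten = 10 ** np.arange(3, -1, -1)
str2num = np.remainder(as_decimal // powers_of_ten, 10)

# For the second row: 1001 // [1000, 100, 10, 1] = [1, 10, 100, 1001],
# and taking each value modulo 10 gives the bits [1, 0, 0, 1].
out = str2num.dot(2 ** np.arange(3, -1, -1))
print(str2num)
print(out)
```

Broadcasting the (n, 1) column against the length-4 vector of powers is what produces one row of four digits per input string.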

Runtime tests

Let's time all the approaches listed thus far to solve the problem, including @Kasramvd's loopy solution.

In [198]: # Setup a huge array of such strings
     ...: A = np.array([['0001'],['1001'],['1100'],['0010']],dtype='|S4')
     ...: A = A.repeat(10000,axis=0)


In [199]: def app1(A):             
     ...:     str2num = (np.fromstring(A, dtype=np.uint8)-48).reshape(-1,4)
     ...:     de2bi_convarr = 2**np.arange(3,-1,-1)
     ...:     out = str2num.dot(de2bi_convarr)    
     ...:     return out
     ...: 
     ...: def app2(A):             
     ...:     str2num = np.remainder(A.astype(int)//(10**np.arange(3,-1,-1)),10)
     ...:     de2bi_convarr = 2**np.arange(3,-1,-1)
     ...:     out = str2num.dot(de2bi_convarr)    
     ...:     return out
     ...: 

In [200]: %timeit app1(A)
1000 loops, best of 3: 1.46 ms per loop

In [201]: %timeit app2(A)
10 loops, best of 3: 36.6 ms per loop

In [202]: %timeit np.array([[int(i[0], 2)] for i in A]) # @Kasramvd's solution
10 loops, best of 3: 61.6 ms per loop
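For anyone wanting to rerun the comparison outside IPython, the following is a rough sketch using the standard timeit module. It is an adaptation rather than the original benchmark: np.frombuffer stands in for binary-mode np.fromstring (deprecated in newer NumPy), the builtin int stands in for np.int, and the function name loopy and the repeat counts are arbitrary choices made here. Absolute timings will differ from the figures quoted above depending on machine and NumPy version.

```python
import timeit
import numpy as np

# Same setup as the session above: 4 distinct strings, each repeated 10000 times.
A = np.array([['0001'], ['1001'], ['1100'], ['0010']], dtype='S4').repeat(10000, axis=0)
weights = 2 ** np.arange(3, -1, -1)

def app1(A):
    # Assumption: np.frombuffer on the raw bytes replaces np.fromstring.
    bits = (np.frombuffer(A.tobytes(), dtype=np.uint8) - 48).reshape(-1, 4)
    return bits.dot(weights)

def app2(A):
    bits = np.remainder(A.astype(int) // (10 ** np.arange(3, -1, -1)), 10)
    return bits.dot(weights)

def loopy(A):
    # The list-comprehension solution: parse each string with int(..., 2).
    return np.array([[int(i[0], 2)] for i in A])

# All three should agree before their timings are compared.
expected = app1(A)
assert np.array_equal(app2(A), expected)
assert np.array_equal(loopy(A).ravel(), expected)

for name, fn in [('app1', app1), ('app2', app2), ('loopy', loopy)]:
    # Best of 3 repeats, 5 calls each, reported per call.
    best = min(timeit.repeat(lambda: fn(A), number=5, repeat=3)) / 5
    print(f'{name}: {best * 1e3:.2f} ms per call')
```

Checking that the outputs agree first guards against comparing the speed of functions that compute different things.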

In keeping with the KISS principle, I'd like to suggest the following approach using a list comprehension:

>>> np.array([[int(i[0], 2)] for i in a])
array([[1],
       [2]])