Slow bitwise operations

Backend · Open · 2 answers · 1035 views
臣服心动 2021-01-02 09:08

I am working on a Python library that performs a lot of bitwise operations on long bit strings, and I want to find the bit string type that will maximize its speed. I have tried the built-in int, bitarray, bitstring, and several NumPy representations, but none of them is as fast as I hoped.

2 Answers
  • 2021-01-02 09:35

    Are the things you are trying to test actually vector operations at all? If you simply compare the speed of a single operation, plain Python is going to win, because it doesn't have to set up NumPy arrays or bitarrays first.

    How about trying out the following?

    import random
    import numpy as np

    x = np.array([random.randrange(2**31) for _ in range(1000)])
    y = np.array([random.randrange(2**31) for _ in range(1000)])

    %timeit x & y  # in IPython; one vectorized AND over all 1000 elements

    # even though x and y are NumPy arrays, here we iterate over them in
    # Python - no vector operation at all
    %timeit [a & b for (a, b) in zip(x, y)]
    

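    To see the same gap outside IPython, the comparison can be sketched with the stdlib `timeit` module (the array size of 1000 and the repeat count of 1000 here are arbitrary choices, not values from the question):

    ```python
    import random
    import timeit

    import numpy as np

    n = 1000
    x = np.array([random.randrange(2**31) for _ in range(n)])
    y = np.array([random.randrange(2**31) for _ in range(n)])

    # One vectorized AND over all n elements per call.
    vec = timeit.timeit(lambda: x & y, number=1000)

    # Element-by-element AND, iterating in Python over NumPy scalars.
    loop = timeit.timeit(lambda: [a & b for a, b in zip(x, y)], number=1000)

    print(f"vectorized: {vec:.4f}s  per-element: {loop:.4f}s")
    ```

    The vectorized version does one C-level pass over the buffer, while the list comprehension pays Python interpreter overhead per element, so the gap grows with n.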
    Interestingly, if

    xxx = [random.randrange(2**31) for _ in range(1000)]
    yyy = [random.randrange(2**31) for _ in range(1000)]
    

    and then

    %timeit [a & b for (a,b) in zip(xxx,yyy)]
    

    With pure Python lists, iterating is faster than iterating over NumPy arrays, which is a bit counter-intuitive. I'm not sure why.
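    One plausible explanation (my guess, not something established in the question): indexing or iterating a NumPy array yields NumPy scalar objects rather than plain ints, and those scalars carry extra overhead in Python-level arithmetic. You can see the difference in types directly:

    ```python
    import numpy as np

    xs = np.arange(5)
    print(type(xs[0]))           # a NumPy integer scalar type
    print(type(xs.tolist()[0]))  # plain Python int
    ```

    Calling `.tolist()` before iterating hands the loop plain ints, which is consistent with the pure-list version being faster.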

    Similarly, you can try this with bitstrings and bitarrays.

    Is this what you are looking at?

  • 2021-01-02 09:50

    As far as I can tell, the built-in Python 3 int is the only one of the options you tested that computes the & in chunks of more than one byte at a time. (I haven't fully figured out what everything in the NumPy source for this operation does, but it doesn't look like it has an optimization to compute this in chunks bigger than the dtype.)

    • bitarray goes byte-by-byte,
    • the bool and 1-bit-per-int NumPy attempts go bit by bit,
    • the packed NumPy attempt goes byte-by-byte, and
    • the bitstring source goes byte-by-byte, as well as doing some things that screw up its attempts to gain speed through Cython, making it by far the slowest.

    In contrast, the int operation goes by either 15-bit or 30-bit digits, depending on the value of the compile-time parameter PYLONG_BITS_IN_DIGIT. I don't know which setting is the default.
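    You don't have to guess the compile-time setting, though: the digit width the interpreter was built with is exposed at runtime through `sys.int_info`:

    ```python
    import sys

    # bits_per_digit is 15 or 30, depending on PYLONG_BITS_IN_DIGIT;
    # modern 64-bit CPython builds typically use 30.
    print(sys.int_info.bits_per_digit)
    ```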

    You can speed up the NumPy attempt by using a packed representation and a larger dtype. It looks like on my machine, a 32-bit dtype works fastest, beating Python ints; I don't know what it's like on your setup. Testing with 10240-bit values in each format, I get

    >>> timeit.timeit('a & b', 'import numpy; a = b = numpy.array([0]*160, dtype=numpy.uint64)')
    1.3918750826524047
    >>> timeit.timeit('a & b', 'import numpy; a = b = numpy.array([0]*160*8, dtype=numpy.uint8)')
    1.9460716604953632
    >>> timeit.timeit('a & b', 'import numpy; a = b = numpy.array([0]*160*2, dtype=numpy.uint32)')
    1.1728465435917315
    >>> timeit.timeit('a & b', 'a = b = 2**10240-1')
    1.5999407862400403
    
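    If your data starts life as a Python int, one way to move it into the packed 32-bit representation is via `int.to_bytes` and `numpy.frombuffer` (a sketch; the 10240-bit all-ones value just mirrors the sizes in the timings above):

    ```python
    import numpy as np

    nbits = 10240
    a_int = (1 << nbits) - 1  # a 10240-bit all-ones value as a Python int

    # Pack into little-endian bytes, then view the buffer as 32-bit words.
    a_arr = np.frombuffer(a_int.to_bytes(nbits // 8, "little"), dtype=np.uint32)
    b_arr = a_arr.copy()

    result = a_arr & b_arr  # the fast packed AND

    # Round-trip back to a Python int to check nothing was lost.
    back = int.from_bytes(result.tobytes(), "little")
    assert back == a_int & a_int
    ```

    Note that the conversion itself costs a copy each way, so this only pays off if you do many bitwise operations per conversion.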