Slow bitwise operations

Backend · Open · 2 answers · 1035 views
臣服心动 2021-01-02 09:08

I am working on a Python library that performs a lot of bitwise operations on long bit strings, and I want to find the bit string type that will maximize its speed. I have tried the built-in int, bitarray, bitstring, and several NumPy representations, but none of them is as fast as I hoped.

2 Answers
  • 2021-01-02 09:35

    Are the things you are trying to test actually vector operations at all? If you simply compare the speed of a single operation, plain Python is going to win, because it doesn't have to set up NumPy arrays or bitarrays first.

    How about trying out the following?

    import random
    import numpy as np

    x = np.array([random.randrange(2**31) for _ in range(1000)])
    y = np.array([random.randrange(2**31) for _ in range(1000)])

    %timeit x & y  # in IPython; one vectorized AND over all 1000 elements

    # even though x and y are NumPy arrays, here we iterate over them in
    # Python - no vector operation at all
    %timeit [a & b for (a, b) in zip(x, y)]
    

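    To see the same gap outside IPython, the comparison can be sketched with the stdlib `timeit` module (the array size of 1000 and the repeat count of 1000 here are arbitrary choices, not values from the question):

    ```python
    import random
    import timeit

    import numpy as np

    n = 1000
    x = np.array([random.randrange(2**31) for _ in range(n)])
    y = np.array([random.randrange(2**31) for _ in range(n)])

    # One vectorized AND over all n elements per call.
    vec = timeit.timeit(lambda: x & y, number=1000)

    # Element-by-element AND, iterating in Python over NumPy scalars.
    loop = timeit.timeit(lambda: [a & b for a, b in zip(x, y)], number=1000)

    print(f"vectorized: {vec:.4f}s  per-element: {loop:.4f}s")
    ```

    The vectorized version does one C-level pass over the buffer, while the list comprehension pays Python interpreter overhead per element, so the gap grows with n.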
    Interestingly, if

    xxx = [random.randrange(2**31) for _ in range(1000)]
    yyy = [random.randrange(2**31) for _ in range(1000)]
    

    and then

    %timeit [a & b for (a,b) in zip(xxx,yyy)]
    

    With pure Python lists, iterating is faster than iterating over NumPy arrays, which is a bit counter-intuitive. I'm not sure why.
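    One plausible explanation (my guess, not something established in the question): indexing or iterating a NumPy array yields NumPy scalar objects rather than plain ints, and those scalars carry extra overhead in Python-level arithmetic. You can see the difference in types directly:

    ```python
    import numpy as np

    xs = np.arange(5)
    print(type(xs[0]))           # a NumPy integer scalar type
    print(type(xs.tolist()[0]))  # plain Python int
    ```

    Calling `.tolist()` before iterating hands the loop plain ints, which is consistent with the pure-list version being faster.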

    Similarly, you can try this with bitstrings and bitarrays.

    Is this what you are looking at?

  • 2021-01-02 09:50

    As far as I can tell, the built-in Python 3 int is the only one of the options you tested that computes the & in chunks of more than one byte at a time. (I haven't fully figured out what everything in the NumPy source for this operation does, but it doesn't look like it has an optimization to compute this in chunks bigger than the dtype.)

    • bitarray goes byte-by-byte,
    • the bool and 1-bit-per-int NumPy attempts go bit by bit,
    • the packed NumPy attempt goes byte-by-byte, and
    • the bitstring source goes byte-by-byte, as well as doing some things that screw up its attempts to gain speed through Cython, making it by far the slowest.

    In contrast, the int operation goes by either 15-bit or 30-bit digits, depending on the value of the compile-time parameter PYLONG_BITS_IN_DIGIT. I don't know which setting is the default.
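    You don't have to guess the compile-time setting, though: the digit width the interpreter was built with is exposed at runtime through `sys.int_info`:

    ```python
    import sys

    # bits_per_digit is 15 or 30, depending on PYLONG_BITS_IN_DIGIT;
    # modern 64-bit CPython builds typically use 30.
    print(sys.int_info.bits_per_digit)
    ```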

    You can speed up the NumPy attempt by using a packed representation and a larger dtype. It looks like on my machine, a 32-bit dtype works fastest, beating Python ints; I don't know what it's like on your setup. Testing with 10240-bit values in each format, I get

    >>> timeit.timeit('a & b', 'import numpy; a = b = numpy.array([0]*160, dtype=numpy.uint64)')
    1.3918750826524047
    >>> timeit.timeit('a & b', 'import numpy; a = b = numpy.array([0]*160*8, dtype=numpy.uint8)')
    1.9460716604953632
    >>> timeit.timeit('a & b', 'import numpy; a = b = numpy.array([0]*160*2, dtype=numpy.uint32)')
    1.1728465435917315
    >>> timeit.timeit('a & b', 'a = b = 2**10240-1')
    1.5999407862400403
    
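    If your data starts life as a Python int, one way to move it into the packed 32-bit representation is via `int.to_bytes` and `numpy.frombuffer` (a sketch; the 10240-bit all-ones value just mirrors the sizes in the timings above):

    ```python
    import numpy as np

    nbits = 10240
    a_int = (1 << nbits) - 1  # a 10240-bit all-ones value as a Python int

    # Pack into little-endian bytes, then view the buffer as 32-bit words.
    a_arr = np.frombuffer(a_int.to_bytes(nbits // 8, "little"), dtype=np.uint32)
    b_arr = a_arr.copy()

    result = a_arr & b_arr  # the fast packed AND

    # Round-trip back to a Python int to check nothing was lost.
    back = int.from_bytes(result.tobytes(), "little")
    assert back == a_int & a_int
    ```

    Note that the conversion itself costs a copy each way, so this only pays off if you do many bitwise operations per conversion.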