The post "Why is processing a sorted array faster than random array" says that branch prediction is the reason for the performance boost with sorted arrays.
But I just
I ported the original code to Python and ran it with PyPy. I can confirm that the sorted array is processed faster than the unsorted one, and that the branchless method also works: it eliminates the branch, and its running time on unsorted data is similar to that of the sorted array. I believe this is because PyPy's JIT compiles the hot loop to native code, so the if statement becomes a real conditional branch and the cost of mispredicting it on unsorted data becomes visible.
Here's the code I used:
    import random
    import time

    def runme(data):
        sum = 0
        start = time.time()

        for i in xrange(100000):
            for c in data:
                if c >= 128:
                    sum += c

        end = time.time()
        print end - start
        print sum

    def runme_branchless(data):
        sum = 0
        start = time.time()

        for i in xrange(100000):
            for c in data:
                t = (c - 128) >> 31
                sum += ~t & c

        end = time.time()
        print end - start
        print sum

    data = list()
    for i in xrange(32768):
        data.append(random.randint(0, 256))

    sorted_data = sorted(data)
    runme(sorted_data)
    runme(data)
    runme_branchless(sorted_data)
    runme_branchless(data)
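The mask trick in runme_branchless comes from the C++ original, where it relies on 32-bit integers. It still works in Python because right-shifting a negative int by 31 gives -1 for every value in this data range, even though Python ints are arbitrary precision. Here is a minimal sketch that checks the branchless expression against the branching one. The helper names select_branch and select_branchless are mine, not from the code above, and the snippet is written to run on both Python 2 and Python 3.

```python
def select_branch(c):
    # Value the branching loop adds to the running sum for one element.
    return c if c >= 128 else 0


def select_branchless(c):
    # (c - 128) >> 31 is -1 when c < 128 and 0 when 128 <= c < 128 + 2**31.
    # ~t is therefore 0 or -1 (all bits set), which acts as a mask on c.
    t = (c - 128) >> 31
    return ~t & c


# random.randint(0, 256) is inclusive on both ends, so the data spans 0..256.
values = list(range(257))

# Both versions must agree on every possible input value.
assert all(select_branch(c) == select_branchless(c) for c in values)

# Only the values 128..256 survive the mask.
total = sum(select_branchless(c) for c in values)
print(total)
```

Since both functions agree on every value the benchmark can generate, the two timing loops compute the same sum, and any difference in running time comes from how the work is done rather than from what is computed.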