Question
So, the correct way of calculating mid in a binary search is mid = low + ((high - low) / 2), in order to avoid overflow errors.
My implementation uses unsigned 64-bit variables, and I don't foresee my arrays ever getting big enough to cause an overflow. Do I still need to use the above implementation, or can I use mid = (low + high) / 2?
What's best practice here?
Answer 1:
If there is no possibility of overflow, the overflow-safe way of computing the midpoint is technically unnecessary: you can use the unsafe formula if you wish. However, it's probably a good idea to keep the safe version anyway, in case your program is modified some day in a way that breaks your assumptions. I think that adding a single CPU instruction to make your code future-proof is a great investment in the maintainability of your code.
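To make the difference concrete, here is a minimal sketch in C, assuming uint64_t indices and a half-open search range [low, high). The names midpoint and binary_search are made up for illustration and are not from the question.

```c
#include <stdint.h>

/* Overflow-safe midpoint. Assumes low <= high, so high - low cannot wrap. */
uint64_t midpoint(uint64_t low, uint64_t high)
{
    return low + (high - low) / 2;
}

/* Binary search over a sorted int array, using the half-open range [low, high).
   Returns the index of key, or n if key is not present. */
uint64_t binary_search(const int *a, uint64_t n, int key)
{
    uint64_t low = 0, high = n;
    while (low < high) {
        uint64_t mid = midpoint(low, high);
        if (a[mid] < key)
            low = mid + 1;   /* key, if present, lies to the right of mid */
        else if (a[mid] > key)
            high = mid;      /* key, if present, lies to the left of mid */
        else
            return mid;
    }
    return n;
}
```

With unsigned types, (low + high) / 2 does not invoke undefined behaviour when the sum exceeds UINT64_MAX; it silently wraps and yields a wrong index. That can only happen once low + high reaches 2^64, which is why the simple formula is safe for any realistic array size today.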
Answer 2:
Check this article: "Nearly All Binary Searches and Mergesorts are Broken"
Better practice (for today)
Probably faster, and arguably as clear is:
6: int mid = (low + high) >>> 1;
and after that:
In C and C++ (where you don't have the >>> operator), you can do this:
6: mid = ((unsigned int)low + (unsigned int)high) >> 1;
And at the end:
Update 17 Feb 2008: Thanks to Antoine Trux, Principal Member of Engineering Staff at Nokia Research Center Finland for pointing out that the original proposed fix for C and C++ (Line 6), was not guaranteed to work by the relevant C99 standard (INTERNATIONAL STANDARD - ISO/IEC - 9899 - Second edition - 1999-12-01, 3.4.3.3), which says that if you add two signed quantities and get an overflow, the result is undefined. The older C Standard, C89/90, and the C++ Standard are both identical to C99 in this respect. Now that we've made this change, we know that the program is correct;)
Bottom line: there will always be some case where it won't work.
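As a rough sketch of the C variant quoted above, assuming int indices that are non-negative with low <= high (the function name midpoint_int is hypothetical):

```c
#include <limits.h>

/* Assumption made explicit: unsigned int can hold the sum of two ints.
   This is true on typical platforms, where UINT_MAX == 2 * INT_MAX + 1. */
_Static_assert(UINT_MAX / 2 >= INT_MAX, "unsigned int too narrow for this trick");

/* Midpoint of two non-negative int indices.
   The casts happen before the addition, so the sum is computed in
   unsigned arithmetic and no signed overflow can occur. */
int midpoint_int(int low, int high)
{
    return (int)(((unsigned int)low + (unsigned int)high) >> 1);
}
```

The important detail is that each operand is converted to unsigned int before the addition. Casting the already-computed signed sum would be too late, because the signed addition itself is where the undefined behaviour arises.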
Answer 3:
Don Knuth's method uses only bitwise operations and has no possibility of overflow:
return (low & high) + ((low ^ high) >> 1);
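The formula works because low + high == 2 * (low & high) + (low ^ high): bits set in both operands count twice in the sum, and bits set in only one operand count once. Halving that identity gives the expression above. A small sketch, assuming unsigned 64-bit operands as in the question (the name midpoint_bitwise is made up here):

```c
#include <stdint.h>

/* floor((low + high) / 2) computed without ever forming low + high.
   low & high : bits set in both operands; they appear twice in the sum,
                so they appear once in the half.
   low ^ high : bits set in exactly one operand; the shift halves them. */
uint64_t midpoint_bitwise(uint64_t low, uint64_t high)
{
    return (low & high) + ((low ^ high) >> 1);
}
```

Unlike low + (high - low) / 2, this form does not require low <= high. With signed types, right-shifting a negative value is implementation-defined in C, so this sketch sticks to unsigned operands.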
Source: https://stackoverflow.com/questions/21101110/calculating-midpoint-index-in-binary-search