问题
Is there a library function that performs binary search on a list/tuple and return the position of the item if found and 'False' (-1, None, etc.) if not?
I found the functions bisect_left/right in the bisect module, but they still return a position even if the item is not in the list. That's perfectly fine for their intended usage, but I just want to know if an item is in the list or not (don't want to insert anything).
I thought of using bisect_left
and then checking if the item at that position is equal to what I'm searching, but that seems cumbersome (and I also need to do bounds checking if the number can be larger than the largest number in my list). If there is a nicer method I'd like to know about it.
Edit To clarify what I need this for: I'm aware that a dictionary would be very well suited for this, but I'm trying to keep the memory consumption as low as possible. My intended usage would be a sort of double-way look-up table. I have in the table a list of values and I need to be able to access the values based on their index. And also I want to be able to find the index of a particular value or None if the value is not in the list.
Using a dictionary for this would be the fastest way, but would (approximately) double the memory requirements.
I was asking this question thinking that I may have overlooked something in the Python libraries. It seems I'll have to write my own code, as Moe suggested.
回答1:
from bisect import bisect_left
def binary_search(a, x, lo=0, hi=None): # can't use a to specify default for hi
hi = hi if hi is not None else len(a) # hi defaults to len(a)
pos = bisect_left(a, x, lo, hi) # find insertion position
return pos if pos != hi and a[pos] == x else -1 # don't walk off the end
回答2:
Why not look at the code for bisect_left/right and adapt it to suit your purpose.
like this:
def binary_search(a, x, lo=0, hi=None):
if hi is None:
hi = len(a)
while lo < hi:
mid = (lo+hi)//2
midval = a[mid]
if midval < x:
lo = mid+1
elif midval > x:
hi = mid
else:
return mid
return -1
回答3:
This is a little off-topic (since Moe's answer seems complete to the OP's question), but it might be worth looking at the complexity for your whole procedure from end to end. If you're storing thing in a sorted lists (which is where a binary search would help), and then just checking for existence, you're incurring (worst-case, unless specified):
Sorted Lists
- O( n log n) to initially create the list (if it's unsorted data. O(n), if it's sorted )
- O( log n) lookups (this is the binary search part)
- O( n ) insert / delete (might be O(1) or O(log n) average case, depending on your pattern)
Whereas with a set(), you're incurring
- O(n) to create
- O(1) lookup
- O(1) insert / delete
The thing a sorted list really gets you are "next", "previous", and "ranges" (including inserting or deleting ranges), which are O(1) or O(|range|), given a starting index. If you aren't using those sorts of operations often, then storing as sets, and sorting for display might be a better deal overall. set() incurs very little additional overhead in python.
回答4:
It might be worth mentioning that the bisect docs now provide searching examples: http://docs.python.org/library/bisect.html#searching-sorted-lists
(Raising ValueError instead of returning -1 or None is more pythonic – list.index() does it, for example. But of course you can adapt the examples to your needs.)
回答5:
Simplest is to use bisect and check one position back to see if the item is there:
def binary_search(a,x,lo=0,hi=-1):
i = bisect(a,x,lo,hi)
if i == 0:
return -1
elif a[i-1] == x:
return i-1
else:
return -1
回答6:
This is right from the manual:
http://docs.python.org/2/library/bisect.html
8.5.1. Searching Sorted Lists
The above bisect() functions are useful for finding insertion points but can be tricky or awkward to use for common searching tasks. The following five functions show how to transform them into the standard lookups for sorted lists:
def index(a, x):
'Locate the leftmost value exactly equal to x'
i = bisect_left(a, x)
if i != len(a) and a[i] == x:
return i
raise ValueError
So with the slight modification your code should be:
def index(a, x):
'Locate the leftmost value exactly equal to x'
i = bisect_left(a, x)
if i != len(a) and a[i] == x:
return i
return -1
回答7:
I agree that @DaveAbrahams's answer using the bisect module is the correct approach. He did not mention one important detail in his answer.
From the docs bisect.bisect_left(a, x, lo=0, hi=len(a))
The bisection module does not require the search array to be precomputed ahead of time. You can just present the endpoints to the bisect.bisect_left
instead of it using the defaults of 0
and len(a)
.
Even more important for my use, looking for a value X such that the error of a given function is minimized. To do that, I needed a way to have the bisect_left's algorithm call my computation instead. This is really simple.
Just provide an object that defines __getitem__
as a
For example, we could use the bisect algorithm to find a square root with arbitrary precision!
import bisect
class sqrt_array(object):
def __init__(self, digits):
self.precision = float(10**(digits))
def __getitem__(self, key):
return (key/self.precision)**2.0
sa = sqrt_array(4)
# "search" in the range of 0 to 10 with a "precision" of 0.0001
index = bisect.bisect_left(sa, 7, 0, 10*10**4)
print 7**0.5
print index/(10**4.0)
回答8:
If you just want to see if it's present, try turning the list into a dict:
# Generate a list
l = [n*n for n in range(1000)]
# Convert to dict - doesn't matter what you map values to
d = dict((x, 1) for x in l)
count = 0
for n in range(1000000):
# Compare with "if n in l"
if n in d:
count += 1
On my machine, "if n in l" took 37 seconds, while "if n in d" took 0.4 seconds.
回答9:
This one is:
- not recursive (which makes it more memory-efficient than most recursive approaches)
- actually working
- fast since it runs without any unnecessary if's and conditions
- based on a mathematical assertion that the floor of (low + high)/2 is always smaller than high where low is the lower limit and high is the upper limit.
- tested :D
def binsearch(t, key, low = 0, high = len(t) - 1):
# bisecting the range
while low < high:
mid = (low + high)//2
if t[mid] < key:
low = mid + 1
else:
high = mid
# at this point 'low' should point at the place
# where the value of 'key' is possibly stored.
return low if t[low] == key else -1
回答10:
Dave Abrahams' solution is good. Although I have would have done it minimalistic:
def binary_search(L, x):
i = bisect.bisect_left(L, x)
if i == len(L) or L[i] != x:
return -1
return i
回答11:
While there's no explicit binary search algorithm in Python, there is a module - bisect
- designed to find the insertion point for an element in a sorted list using a binary search. This can be "tricked" into performing a binary search. The biggest advantage of this is the same advantage most library code has - it's high-performing, well-tested and just works (binary searches in particular can be quite difficult to implement successfully - particularly if edge cases aren't carefully considered).
Basic Types
For basic types like Strings or ints it's pretty easy - all you need is the bisect
module and a sorted list:
>>> import bisect
>>> names = ['bender', 'fry', 'leela', 'nibbler', 'zoidberg']
>>> bisect.bisect_left(names, 'fry')
1
>>> keyword = 'fry'
>>> x = bisect.bisect_left(names, keyword)
>>> names[x] == keyword
True
>>> keyword = 'arnie'
>>> x = bisect.bisect_left(names, keyword)
>>> names[x] == keyword
False
You can also use this to find duplicates:
...
>>> names = ['bender', 'fry', 'fry', 'fry', 'leela', 'nibbler', 'zoidberg']
>>> keyword = 'fry'
>>> leftIndex = bisect.bisect_left(names, keyword)
>>> rightIndex = bisect.bisect_right(names, keyword)
>>> names[leftIndex:rightIndex]
['fry', 'fry', 'fry']
Obviously you could just return the index rather than the value at that index if desired.
Objects
For custom types or objects, things are a little bit trickier: you have to make sure to implement rich comparison methods to get bisect to compare correctly.
>>> import bisect
>>> class Tag(object): # a simple wrapper around strings
... def __init__(self, tag):
... self.tag = tag
... def __lt__(self, other):
... return self.tag < other.tag
... def __gt__(self, other):
... return self.tag > other.tag
...
>>> tags = [Tag('bender'), Tag('fry'), Tag('leela'), Tag('nibbler'), Tag('zoidbe
rg')]
>>> key = Tag('fry')
>>> leftIndex = bisect.bisect_left(tags, key)
>>> rightIndex = bisect.bisect_right(tags, key)
>>> print([tag.tag for tag in tags[leftIndex:rightIndex]])
['fry']
This should work in at least Python 2.7 -> 3.3
回答12:
Using a dict wouldn't like double your memory usage unless the objects you're storing are really tiny, since the values are only pointers to the actual objects:
>>> a = 'foo'
>>> b = [a]
>>> c = [a]
>>> b[0] is c[0]
True
In that example, 'foo' is only stored once. Does that make a difference for you? And exactly how many items are we talking about anyway?
回答13:
This code works with integer lists in a recursive way. Looks for the simplest case scenario, which is: list length less than 2. It means the answer is already there and a test is performed to check for the correct answer. If not, a middle value is set and tested to be the correct, if not bisection is performed by calling again the function, but setting middle value as the upper or lower limit, by shifting it to the left or right
def binary_search(intList, intValue, lowValue, highValue): if(highValue - lowValue) < 2: return intList[lowValue] == intValue or intList[highValue] == intValue middleValue = lowValue + ((highValue - lowValue)/2) if intList[middleValue] == intValue: return True if intList[middleValue] > intValue: return binary_search(intList, intValue, lowValue, middleValue - 1) return binary_search(intList, intValue, middleValue + 1, highValue)
回答14:
Check out the examples on Wikipedia http://en.wikipedia.org/wiki/Binary_search_algorithm
def binary_search(a, key, imin=0, imax=None):
if imax is None:
# if max amount not set, get the total
imax = len(a) - 1
while imin <= imax:
# calculate the midpoint
mid = (imin + imax)//2
midval = a[mid]
# determine which subarray to search
if midval < key:
# change min index to search upper subarray
imin = mid + 1
elif midval > key:
# change max index to search lower subarray
imax = mid - 1
else:
# return index number
return mid
raise ValueError
回答15:
'''
Only used if set your position as global
'''
position #set global
def bst(array,taget): # just pass the array and target
global position
low = 0
high = len(array)
while low <= high:
mid = (lo+hi)//2
if a[mid] == target:
position = mid
return -1
elif a[mid] < target:
high = mid+1
else:
low = mid-1
return -1
I guess this is much better and effective. please correct me :) . Thank you
回答16:
s
is a list.binary(s, 0, len(s) - 1, find)
is the initial call.Function returns an index of the queried item. If there is no such item it returns
-1
.def binary(s,p,q,find): if find==s[(p+q)/2]: return (p+q)/2 elif p==q-1 or p==q: if find==s[q]: return q else: return -1 elif find < s[(p+q)/2]: return binary(s,p,(p+q)/2,find) elif find > s[(p+q)/2]: return binary(s,(p+q)/2+1,q,find)
回答17:
def binary_search_length_of_a_list(single_method_list):
index = 0
first = 0
last = 1
while True:
mid = ((first + last) // 2)
if not single_method_list.get(index):
break
index = mid + 1
first = index
last = index + 1
return mid
回答18:
Binary Search :
// List - values inside list
// searchItem - Item to search
// size - Size of list
// upperBound - higher index of list
// lowerBound - lower index of list
def binarySearch(list, searchItem, size, upperBound, lowerBound):
print(list)
print(upperBound)
print(lowerBound)
mid = ((upperBound + lowerBound)) // 2
print(mid)
if int(list[int(mid)]) == value:
return "value exist"
elif int(list[int(mid)]) < value:
return searchItem(list, value, size, upperBound, mid + 1)
elif int(list[int(mid)]) > value:
return searchItem(list, value, size, mid - 1, lowerBound)
// To call above function use :
list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
searchItem = 1
print(searchItem(list[0], item, len(list[0]) -1, len(list[0]) - 1, 0))
回答19:
I needed binary search in python and generic for Django models. In Django models, one model can have foreign key to another model and I wanted to perform some search on the retrieved models objects. I wrote following function you can use this.
def binary_search(values, key, lo=0, hi=None, length=None, cmp=None):
"""
This is a binary search function which search for given key in values.
This is very generic since values and key can be of different type.
If they are of different type then caller must specify `cmp` function to
perform a comparison between key and values' item.
:param values: List of items in which key has to be search
:param key: search key
:param lo: start index to begin search
:param hi: end index where search will be performed
:param length: length of values
:param cmp: a comparator function which can be used to compare key and values
:return: -1 if key is not found else index
"""
assert type(values[0]) == type(key) or cmp, "can't be compared"
assert not (hi and length), "`hi`, `length` both can't be specified at the same time"
lo = lo
if not lo:
lo = 0
if hi:
hi = hi
elif length:
hi = length - 1
else:
hi = len(values) - 1
while lo <= hi:
mid = lo + (hi - lo) // 2
if not cmp:
if values[mid] == key:
return mid
if values[mid] < key:
lo = mid + 1
else:
hi = mid - 1
else:
val = cmp(values[mid], key)
# 0 -> a == b
# > 0 -> a > b
# < 0 -> a < b
if val == 0:
return mid
if val < 0:
lo = mid + 1
else:
hi = mid - 1
return -1
回答20:
Many good solutions above but I haven't seen a simple (KISS keep it simple (cause I'm) stupid use of the Python built in/generic bisect function to do a binary search. With a bit of code around the bisect function, I think I have an example below where I have tested all cases for a small string array of names. Some of the above solutions allude to/say this, but hopefully the simple code below will help anyone confused like I was.
Python bisect is used to indicate where to insert an a new value/search item into a sorted list. The below code which uses bisect_left which will return the index of the hit if the search item in the list/array is found (Note bisect and bisect_right will return the index of the element after the hit or match as the insertion point) If not found, bisect_left will return an index to the next item in the sorted list which will not == the search value. The only other case is where the search item would go at the end of the list where the index returned would be beyond the end of the list/array, and which in the code below the early exit by Python with "and" logic handles. (first condition False Python does not check subsequent conditions)
#Code
from bisect import bisect_left
names=["Adam","Donny","Jalan","Zach","Zayed"]
search=""
lenNames = len(names)
while search !="none":
search =input("Enter name to search for or 'none' to terminate program:")
if search == "none":
break
i = bisect_left(names,search)
print(i) # show index returned by Python bisect_left
if i < (lenNames) and names[i] == search:
print(names[i],"found") #return True - if function
else:
print(search,"not found") #return False – if function
##Exhaustive test cases:
##Enter name to search for or 'none' to terminate program:Zayed
##4
##Zayed found
##Enter name to search for or 'none' to terminate program:Zach
##3
##Zach found
##Enter name to search for or 'none' to terminate program:Jalan
##2
##Jalan found
##Enter name to search for or 'none' to terminate program:Donny
##1
##Donny found
##Enter name to search for or 'none' to terminate program:Adam
##0
##Adam found
##Enter name to search for or 'none' to terminate program:Abie
##0
##Abie not found
##Enter name to search for or 'none' to terminate program:Carla
##1
##Carla not found
##Enter name to search for or 'none' to terminate program:Ed
##2
##Ed not found
##Enter name to search for or 'none' to terminate program:Roger
##3
##Roger not found
##Enter name to search for or 'none' to terminate program:Zap
##4
##Zap not found
##Enter name to search for or 'none' to terminate program:Zyss
##5
##Zyss not found
来源:https://stackoverflow.com/questions/212358/binary-search-bisection-in-python