Using bisect on a list of tuples but comparing on the first value only

旧时难觅i 2021-01-18 14:15

I read that question about how to use bisect on a list of tuples, and I used that information to answer that question. It works, but I'd like a more generic so

4 Answers
  • 2021-01-18 14:47

    bisect supports arbitrary sequences. If you need to use bisect with a key, instead of passing the key to bisect, you can build it into the sequence:

    class KeyList(object):
        # bisect doesn't accept a key function, so we build the key into our sequence.
        def __init__(self, l, key):
            self.l = l
            self.key = key
        def __len__(self):
            return len(self.l)
        def __getitem__(self, index):
            return self.key(self.l[index])
    

    Then you can use bisect with a KeyList, with O(log n) performance and no need to copy the bisect source or write your own binary search:

    bisect.bisect_right(KeyList(test_array, key=lambda x: x[0]), 5)
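
As a rough end-to-end check, the wrapper can be exercised on sample data. This is only a sketch: the test_array values are borrowed from another answer on this page, and the attribute name items is my own. On Python 3.10 and later the bisect functions also accept a key argument directly, so the wrapper mainly matters on older versions; the version check below guards that call.

```python
import bisect
import sys


class KeyList:
    """Read-only view exposing key(item) for each item of a sorted list."""

    def __init__(self, items, key):
        self.items = items
        self.key = key

    def __len__(self):
        return len(self.items)

    def __getitem__(self, index):
        return self.key(self.items[index])


# Sample data, already sorted by the first tuple element.
test_array = [(1, 2), (3, 4), (5, 6), (5, 7000), (7, 8), (9, 10)]

# Index of the first tuple whose first element is strictly greater than 5.
idx_wrapper = bisect.bisect_right(KeyList(test_array, key=lambda t: t[0]), 5)
print(idx_wrapper, test_array[idx_wrapper])

# Python 3.10+ only: the same lookup using the built-in key parameter.
if sys.version_info >= (3, 10):
    idx_builtin = bisect.bisect_right(test_array, 5, key=lambda t: t[0])
    print(idx_builtin == idx_wrapper)
```

The wrapper never copies the list; each probe of the binary search calls key on one element, so the lookup stays O(log n).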
    
  • 2021-01-18 14:47

    This is a (quick'n'dirty) bisect_left implementation that allows an arbitrary key function:

    def bisect(lst, value, key=None):
        if key is None:
            key = lambda x: x
        def bis(lo, hi=len(lst)):
            while lo < hi:
                mid = (lo + hi) // 2
                if key(lst[mid]) < value:
                    lo = mid + 1
                else:
                    hi = mid
            return lo
        return bis(0)
    
    > from operator import itemgetter
    > test_array = [(1, 2), (3, 4), (4, 3), (5.2, 6), (5.2, 7000), (5.3, 8), (9, 10)]
    > print(bisect(test_array, 5, key=itemgetter(0)))
    3
    

    This keeps the O(log N) performance, since it does not assemble a new list of keys. Implementations of binary search are widely available, but this one was taken straight from the bisect_left source. Note that the list must already be sorted with respect to the same key function.
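
The function above mirrors bisect_left, so when the list contains keys equal to the search value it returns the first of them rather than the first key strictly greater. A minimal sketch of the bisect_right counterpart follows; the name bisect_right_by_key is hypothetical, and the sample data (with duplicate keys of 5) is borrowed from another answer on this page.

```python
from operator import itemgetter


def bisect_right_by_key(lst, value, key=None):
    """Return the index just past the last item whose key is <= value.

    Hypothetical helper; lst must already be sorted by the same key.
    """
    if key is None:
        key = lambda x: x
    lo, hi = 0, len(lst)
    while lo < hi:
        mid = (lo + hi) // 2
        if value < key(lst[mid]):
            hi = mid      # key at mid is greater: answer is at mid or before
        else:
            lo = mid + 1  # key at mid is <= value: answer is after mid
    return lo


test_array = [(1, 2), (3, 4), (5, 6), (5, 7000), (7, 8), (9, 10)]
first_above_5 = bisect_right_by_key(test_array, 5, key=itemgetter(0))
print(first_above_5)
```

The only change from the left-hand version is the comparison: the search moves right whenever the key at mid is less than or equal to the value.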

  • 2021-01-18 14:56

    For this:

    ...want to find the first item where x > 5 for those (x,y) tuples (not considering y at all)

    Something like:

    import bisect
    test_array = [(1,2),(3,4),(5,6),(5,7000),(7,8),(9,10)]
    
    first_elem = [elem[0] for elem in test_array]
    print(bisect.bisect_right(first_elem, 5))
    

    The bisect_right function returns the index just past the last entry equal to 5, and since you're only concerned with the first element of each tuple, this part seems straightforward. I realize this still doesn't generalise to an arbitrary key function.

    As @Jean-FrançoisFabre pointed out, building first_elem already walks the entire array in O(n), so the O(log n) bisect afterwards may not help much for a single lookup.
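
If the same list is searched many times, the O(n) cost of extracting the keys can be paid once and spread over the queries. A minimal sketch, assuming the list does not change between lookups; the helper name first_index_above is made up for illustration.

```python
import bisect

test_array = [(1, 2), (3, 4), (5, 6), (5, 7000), (7, 8), (9, 10)]

# O(n), done once; must be rebuilt if test_array changes.
first_elems = [elem[0] for elem in test_array]


def first_index_above(x):
    """Index of the first tuple whose first element is > x (O(log n))."""
    return bisect.bisect_right(first_elems, x)


results = [first_index_above(x) for x in (0, 5, 8, 100)]
print(results)
```

An index equal to len(test_array) means no tuple has a first element above the query.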

    Not sure if it's any quicker, but we could alternatively scan lazily with filter (itertools.ifilter in Python 2; yes, this is a bit ugly):

    test_array = [(1,2),(3,4),(5,6),(5,7000),(7,8),(9,10)]
    
    # Python 3: the built-in filter is lazy, so next() stops at the first match.
    print(next(filter(
        lambda tp: tp[1][0] > 5,
        enumerate(test_array)))[0]
    )
    
  • 2021-01-18 15:01

    As an addition to the nice suggestions, I'd like to add my own answer which works with floats (as I just figured it out)

    bisect.bisect_left(test_array, (min_value + abs(min_value) * sys.float_info.epsilon,))
    

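    would work (whether min_value is positive or negative, though not when it is exactly zero, since the added term is then zero too). For an ordinary nonzero float, abs(min_value) multiplied by epsilon is at least one unit in the last place of min_value, so adding it is not absorbed by rounding. The result is one of the nearest floats above min_value, and bisect will work with that.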

    If you only have integers, this is faster and clearer:

    bisect.bisect_left(test_array,(min_value+1,))
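
Both variants rely on how Python orders tuples: a 1-tuple (v,) sorts before any (v, y), so bisect_left with (v,) lands on the first tuple whose first element is at least v. The sketch below checks that, and tries math.nextafter as a possible alternative to the epsilon arithmetic. Note that math.nextafter needs Python 3.9 or later, and the float sample data is illustrative rather than taken from the question.

```python
import bisect
import math

test_array = [(1.0, 2), (3.0, 4), (5.0, 6), (5.0, 7000), (7.0, 8), (9.0, 10)]
min_value = 5.0

# A shorter tuple that matches on its common prefix compares as smaller.
assert (5.0,) < (5.0, 6)

# Smallest float strictly greater than min_value (Python 3.9+).
just_above = math.nextafter(min_value, math.inf)

# First tuple whose first element is >= just_above, i.e. > min_value.
idx = bisect.bisect_left(test_array, (just_above,))
print(idx, test_array[idx])
```

Unlike the epsilon expression, math.nextafter also steps away from zero, so it may be the safer choice when min_value can be 0.0.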
    