Fastest way to check if a value exists in a list

猫巷女王i 2020-11-22 00:18

What is the fastest way to know if a value exists in a list (a list with millions of values in it) and what its index is?

I know that all values in the list are unique.

12 Answers
  • 2020-11-22 00:33
    from collections.abc import Iterable

    def check_availability(element, collection: Iterable) -> bool:
        return element in collection
    

    Usage

    check_availability('a', [1,2,3,4,'a','b','c'])
    

    I believe this is the most direct way to check whether a chosen value is in a list. Note that in on a list is a linear scan, so each lookup takes O(n) time.
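
The question also asks for the position of the value. A minimal sketch using the built-in list.index method could look like the following. The helper name find_index is hypothetical, and list.index scans the list linearly just like in does, so this illustrates the API rather than a speed-up.

```python
from typing import Any, Optional, Sequence


def find_index(element: Any, collection: Sequence) -> Optional[int]:
    """Return the position of element in collection, or None if absent.

    find_index is a hypothetical helper name used only for illustration.
    """
    try:
        # index() raises ValueError when the element is missing.
        return collection.index(element)
    except ValueError:
        return None


values = [1, 2, 3, 4, 'a', 'b', 'c']
print(find_index('a', values))  # position of 'a' in the list
print(find_index('z', values))  # None, because 'z' is not in the list
```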

  • 2020-11-22 00:39

    It sounds like your application might benefit from a Bloom filter.

    In short, a Bloom filter lookup can tell you very quickly if a value is DEFINITELY NOT present in a set. Otherwise, you do a slower lookup to get the index of a value that MIGHT be in the list. So if your application gets the "not found" result much more often than the "found" result, you might see a speed-up by adding a Bloom filter.

    For details, Wikipedia provides a good overview of how Bloom filters work, and a web search for "python bloom filter library" will turn up at least a couple of useful implementations.
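
To make the idea concrete, here is a minimal sketch of a Bloom filter placed in front of a slower exact lookup. Everything in it is an assumption made for illustration: the class BloomFilter, the helper find_index_with_filter, the bit-array size, the number of hashes, and the use of salted SHA-256 digests. A real library would tune these parameters to the expected number of items.

```python
import hashlib


class BloomFilter:
    """Tiny illustrative Bloom filter; the sizes are arbitrary assumptions."""

    def __init__(self, num_bits: int = 1 << 20, num_hashes: int = 5):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one SHA-256 hash.
        data = repr(item).encode("utf-8")
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(bytes([salt]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item) -> bool:
        # False means definitely absent; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


values = list(range(0, 20_000, 2))    # example data: even numbers only
bloom = BloomFilter()
for v in values:
    bloom.add(v)


def find_index_with_filter(value):
    """Return the index of value in values, or None if it is absent."""
    if not bloom.might_contain(value):
        return None                    # fast path: definitely not present
    try:
        return values.index(value)     # slower exact lookup
    except ValueError:
        return None                    # the filter gave a false positive
```

The filter never reports a stored value as absent, so a None from the fast path is always correct; only "possibly present" answers need the exact lookup.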

  • 2020-11-22 00:40

    You could put your items into a set. Set lookups are very efficient.

    Try:

    a = [4, 2, 3, 1, 5, 6, 7]  # example list
    s = set(a)                 # build the set once
    if 7 in s:
        print("found")         # do stuff
    

    Edit: In a comment you say that you'd like to get the index of the element. Unfortunately, sets have no notion of element position. An alternative is to pre-sort your list and then use binary search every time you need to find an element.
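
The pre-sort and binary search alternative can be sketched with the standard library bisect module. The helper name contains_sorted and the sample data are assumptions for illustration. Note that the position bisect_left finds refers to the sorted copy, not to the original list.

```python
from bisect import bisect_left

a = [4, 2, 3, 1, 5, 6]          # example data
sorted_a = sorted(a)            # one-time O(n log n) cost


def contains_sorted(sorted_values, target):
    """Return True if target occurs in the already sorted list."""
    i = bisect_left(sorted_values, target)   # O(log n) binary search
    return i < len(sorted_values) and sorted_values[i] == target


print(contains_sorted(sorted_a, 3))
print(contains_sorted(sorted_a, 7))
```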

  • 2020-11-22 00:42

    The question is not always about the fastest technical solution, so I suggest the one that is most straightforward to understand and write: a one-line list comprehension.

    [list_to_search_in.index(i) for i in list_from_which_to_search if i in list_to_search_in]
    

    I had a list_to_search_in with all the items, and wanted the indexes of the items from list_from_which_to_search.

    This returns those indexes as a list. (Without the .index() call, the comprehension returns the matching items themselves rather than their positions.)

    There are other ways to solve this, but a list comprehension is quick to write and fast enough for small lists. Each in and .index() call scans list_to_search_in, so it slows down on very large lists.

  • 2020-11-22 00:44

    This is not code, but an algorithm for very fast searching.

    If your list and the value you are looking for are all numbers, this is pretty straightforward. For strings, see the note at the end.

    • Let "n" be the length of your list
    • Optional step: if you need the index of the element, add a second column to the list holding each element's current index (0 to n-1); see below
    • Sort your list or a copy of it (.sort())
    • Loop through:
      • Compare your number to the element at index n/2
        • If larger, loop again between indexes n/2 and n
        • If smaller, loop again between indexes 0 and n/2
        • If the same: you found it
    • Keep narrowing the range until you have found it or only 2 numbers remain (the ones just below and just above the one you are looking for), in which case it is not in the list
    • This finds any element in about 20 steps for a list of 1,000,000 items (roughly log2(n) comparisons)

    If you also need the original position of your number, read it from the second (index) column.

    If your list is not made of numbers, the method still works, as long as the elements can be ordered. In Python, strings already compare lexicographically, so a custom comparison function is only needed for types without a natural order.

    Of course, this requires sorting the list first, which costs O(n log n), but if you keep reusing the same list for lookups, it may be worth it.
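
The steps above can be sketched as follows. This is one possible reading of the algorithm, not the answer author's own code: the function names build_sorted_pairs and original_index are hypothetical, and the "second column" is modelled as (value, original_index) tuples. It assumes the values are unique, as stated in the question.

```python
def build_sorted_pairs(values):
    """Pair every value with its original index, then sort by value."""
    return sorted((value, index) for index, value in enumerate(values))


def original_index(sorted_pairs, target):
    """Binary search; return target's index in the original list or None."""
    low, high = 0, len(sorted_pairs) - 1
    while low <= high:
        mid = (low + high) // 2
        value, index = sorted_pairs[mid]
        if value == target:
            return index          # found: report the original position
        if value < target:
            low = mid + 1         # target is larger: search the upper half
        else:
            high = mid - 1        # target is smaller: search the lower half
    return None                   # range is empty: target is not present


data = [40, 10, 30, 20, 50]
pairs = build_sorted_pairs(data)  # sort once, then reuse for every lookup
print(original_index(pairs, 30))  # 2
print(original_index(pairs, 35))  # None
```

The sort is paid once; each later lookup halves the search range, so a list of a million items needs about 20 comparisons.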

  • 2020-11-22 00:46
    a = [4, 2, 3, 1, 5, 6]
    
    # Map each value to its position; build once, reuse for every lookup
    index = {value: position for position, value in enumerate(a)}
    try:
        a_index = index[7]
    except KeyError:
        print("Not found")
    else:
        print("found")
    

    This is only a good idea if the list a doesn't change, so that the dict can be built once and then used repeatedly. If a does change, please provide more detail on what you are doing.
