Is there a NumPy function to return the first index of something in an array?

萌比男神i 2020-11-22 05:55

I know there is a method for a Python list to return the first index of something:

>>> l = [1, 2, 3]
>>> l.index(2)
1

Is there something like that for NumPy arrays?

13 Answers
  • 2020-11-22 06:10

    Just to add a very performant and handy numba alternative based on np.ndenumerate to find the first index:

    from numba import njit
    import numpy as np
    
    @njit
    def index(array, item):
        for idx, val in np.ndenumerate(array):
            if val == item:
                return idx
        # If no item was found, the function implicitly returns None; other return
        # types might be a problem due to Numba's type inference.
    

    This is pretty fast and deals naturally with multidimensional arrays:

    >>> arr1 = np.ones((100, 100, 100))
    >>> arr1[2, 2, 2] = 2
    
    >>> index(arr1, 2)
    (2, 2, 2)
    
    >>> arr2 = np.ones(20)
    >>> arr2[5] = 2
    
    >>> index(arr2, 2)
    (5,)
    

    This can be much faster (because it's short-circuiting the operation) than any approach using np.where or np.nonzero.


    np.argwhere also handles multidimensional arrays gracefully, although you need to convert the result to a tuple manually and it is not short-circuited. Indexing its result with [0] raises an IndexError if no match is found:

    >>> tuple(np.argwhere(arr1 == 2)[0])
    (2, 2, 2)
    >>> tuple(np.argwhere(arr2 == 2)[0])
    (5,)
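
    To cover the no-match case without Numba, the np.argwhere approach can be wrapped in a small helper. This is only a sketch: the name first_index and the choice to return None are assumptions for illustration, and it still scans the whole array.

```python
import numpy as np

def first_index(array, item):
    """Return the index tuple of the first match (row-major order), or None."""
    matches = np.argwhere(array == item)  # shape: (number of matches, array.ndim)
    if matches.shape[0] == 0:
        return None  # nothing matched, so avoid indexing an empty result
    return tuple(int(i) for i in matches[0])

arr1 = np.ones((10, 10, 10))
arr1[2, 2, 2] = 2

print(first_index(arr1, 2))  # (2, 2, 2)
print(first_index(arr1, 3))  # None
```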
    
  • 2020-11-22 06:11

    Yes. Given an array, array, and a value to search for, item, you can use np.where:

    itemindex = np.where(array == item)
    

    The result is a tuple of index arrays, one per dimension: for a two-dimensional array, first all the row indices, then all the column indices.

    For example, if the array is two-dimensional and contains your item at two locations, then

    array[itemindex[0][0]][itemindex[1][0]]
    

    would be equal to your item and so would be:

    array[itemindex[0][1]][itemindex[1][1]]
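
    A small worked example may make the shape of the result clearer. The array contents and the names rows, cols, first and pairs are made up for illustration:

```python
import numpy as np

array = np.array([[1, 7, 3],
                  [4, 5, 7]])
item = 7

# np.where returns one index array per dimension.
rows, cols = np.where(array == item)

# The first match in row-major order, as a (row, column) pair.
first = (int(rows[0]), int(cols[0]))

# All matches, paired up as (row, column).
pairs = list(zip(rows.tolist(), cols.tolist()))

print(first)  # (0, 1)
print(pairs)  # [(0, 1), (1, 2)]
```

    Note that array[rows[0], cols[0]] selects the same element as array[rows[0]][cols[0]] in a single indexing step.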
    
  • 2020-11-22 06:11

    An alternative to selecting the first element from np.where() is to use a generator expression together with enumerate, such as:

    >>> import numpy as np
    >>> x = np.arange(100)   # x = array([0, 1, 2, 3, ... 99])
    >>> next(i for i, x_i in enumerate(x) if x_i == 2)
    2
    

    For a two dimensional array one would do:

    >>> x = np.arange(100).reshape(10,10)   # x = array([[0, 1, 2,... 9], [10,..19],])
    >>> next((i,j) for i, x_i in enumerate(x) 
    ...            for j, x_ij in enumerate(x_i) if x_ij == 2)
    (0, 2)
    

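    The advantage of this approach is that it stops checking the elements of the array after the first match is found, whereas np.where checks all elements for a match. A generator expression is faster if there's a match early in the array; with a late match, or none at all, the Python-level loop is much slower than np.where, and next raises StopIteration when nothing matches.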
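
    If a missing value should not raise an exception, next accepts a default as its second argument. A minimal sketch for the one-dimensional case (the helper name first_index_lazy is hypothetical):

```python
import numpy as np

def first_index_lazy(x, item, default=None):
    """Scan a 1-D array element by element and stop at the first match."""
    # The second argument to next() is returned when the generator is exhausted.
    return next((i for i, x_i in enumerate(x) if x_i == item), default)

x = np.arange(100)

print(first_index_lazy(x, 2))    # 2
print(first_index_lazy(x, 500))  # None
```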

  • 2020-11-22 06:15

    For one-dimensional sorted arrays, it is much simpler and more efficient, O(log(n)), to use numpy.searchsorted, which returns a NumPy integer (the position). For example:

    arr = np.array([1, 1, 1, 2, 3, 3, 4])
    i = np.searchsorted(arr, 3)
    

    Just make sure the array is already sorted.

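    Also check that the returned index i actually holds the searched element, since searchsorted's main purpose is to find the index where an element should be inserted to maintain order. That index can equal len(arr) when the value is larger than every element, so check the bound first.

    if i < len(arr) and arr[i] == 3: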
        print("present")
    else:
        print("not present")
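
    Put together, the lookup and the presence check might look like the sketch below. The function name sorted_index and the None return value are assumptions made for this example; the array is assumed to be sorted in ascending order.

```python
import numpy as np

def sorted_index(arr, value):
    """Index of the first occurrence of value in a sorted 1-D array, or None."""
    # With the default side='left', searchsorted returns the leftmost
    # insertion point, which is the first occurrence if value is present.
    i = int(np.searchsorted(arr, value))
    # i can equal len(arr) when value is larger than every element.
    if i < len(arr) and arr[i] == value:
        return i
    return None

arr = np.array([1, 1, 1, 2, 3, 3, 4])

print(sorted_index(arr, 3))  # 4
print(sorted_index(arr, 5))  # None
print(sorted_index(arr, 0))  # None
```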
    
  • 2020-11-22 06:16

    If you need the index of the first occurrence of only one value, you can use nonzero (or where, which amounts to the same thing in this case):

    >>> t = array([1, 1, 1, 2, 2, 3, 8, 3, 8, 8])
    >>> nonzero(t == 8)
    (array([6, 8, 9]),)
    >>> nonzero(t == 8)[0][0]
    6
    

    If you need the first index of each of many values, you could obviously do the same as above repeatedly, but there is a trick that may be faster. The following finds the indices of the first element of each subsequence:

    >>> nonzero(r_[1, diff(t)])
    (array([0, 3, 5, 6, 7, 8]),)
    

    Notice that it finds the beginning of both subsequences of 3s and both subsequences of 8s:

    [1, 1, 1, 2, 2, 3, 8, 3, 8, 8]

    So it's slightly different than finding the first occurrence of each value. In your program, you may be able to work with a sorted version of t to get what you want:

    >>> st = sorted(t)
    >>> nonzero(r_[1, diff(st)])
    (array([0, 3, 5, 7]),)
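
    If what you want is the first occurrence of each distinct value in the original, unsorted array, a different technique from the diff trick is np.unique with return_index=True, which reports the index of the first occurrence of each unique value. A short sketch using the same t (the names values, first_idx and first_of are illustrative):

```python
import numpy as np

t = np.array([1, 1, 1, 2, 2, 3, 8, 3, 8, 8])

# values: the sorted unique values of t
# first_idx: index in t of the first occurrence of each unique value
values, first_idx = np.unique(t, return_index=True)

print(values)     # [1 2 3 8]
print(first_idx)  # [0 3 5 6]

# Optional lookup table from value to first index.
first_of = dict(zip(values.tolist(), first_idx.tolist()))
print(first_of)   # {1: 0, 2: 3, 3: 5, 8: 6}
```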
    
  • 2020-11-22 06:18

    For 1D arrays, I'd recommend np.flatnonzero(array == value)[0], which is equivalent to both np.nonzero(array == value)[0][0] and np.where(array == value)[0][0] but avoids the ugliness of unboxing a 1-element tuple.
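
    Like the other indexing-based variants, this expression raises an IndexError when the value is absent. A minimal sketch that guards against that, where the helper name first_flat_index and the None return value are assumptions for illustration:

```python
import numpy as np

def first_flat_index(array, value):
    """First index of value in a 1-D array, or None if it does not occur."""
    hits = np.flatnonzero(array == value)  # 1-D array of matching positions
    return int(hits[0]) if hits.size else None

t = np.array([1, 1, 1, 2, 2, 3, 8, 3, 8, 8])

print(first_flat_index(t, 8))  # 6
print(first_flat_index(t, 9))  # None
```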
