How do I get indices of N maximum values in a NumPy array?

长情又很酷 2020-11-22 04:25

NumPy provides a way to get the index of the maximum value of an array via np.argmax.

I would like a similar thing, but returning the indices of the N maximum values.

18 answers
  • 2020-11-22 04:46

    np.argpartition finds the indices of the k largest elements with only a partial sort, so it is faster than np.argsort (which performs a full sort) when the array is large. However, the returned indices are NOT in ascending or descending order of value. Here is an example:
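
    A minimal sketch of that behavior, using a made-up array (the values and the choice k = 3 are illustrative only). The order in which np.argpartition returns the top k indices is not guaranteed, so only the set of indices is stated for it:

```python
import numpy as np

a = np.array([1, 8, 3, 9, 2, 7, 5])  # made-up data with distinct values
k = 3

# Partial sort: the last k positions hold the indices of the k largest values,
# but in no guaranteed order.
part_idx = np.argpartition(a, -k)[-k:]

# Full sort: the last k positions hold the same indices in ascending order of value.
sort_idx = np.argsort(a)[-k:]

print(part_idx, a[part_idx])  # some permutation of the indices 1, 3, 5
print(sort_idx, a[sort_idx])  # [5 1 3] [7 8 9]
```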

    We can see that if you want the top k indices in strictly ascending order, np.argpartition won't return what you want.

    Apart from sorting manually after np.argpartition, my solution is to use torch.topk from PyTorch, a tool for building neural networks that provides NumPy-like APIs with both CPU and GPU support. It's as fast as NumPy with MKL, and offers a GPU boost if you need large matrix/vector calculations.

    torch.topk returns its results in sorted order (largest first by default), so the top k indices come back in strictly descending order of value.

    Note that torch.topk accepts a torch tensor and returns both the top k values and the top k indices as torch.Tensor objects. As in NumPy, torch.topk also accepts an axis argument (named dim in PyTorch) so that you can handle multi-dimensional arrays/tensors.
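
    The manual-sort alternative mentioned above can be sketched in NumPy alone. The helper name topk_sorted is hypothetical (it is not a NumPy or PyTorch function), and the sketch assumes a 1-D array with 1 <= k <= len(arr); it returns a (values, indices) pair, largest first:

```python
import numpy as np

def topk_sorted(arr, k):
    """Hypothetical helper (not part of NumPy or PyTorch): return the k largest
    values of a 1-D array and their indices, largest first."""
    arr = np.asarray(arr)
    # Partial sort: indices of the k largest values, in no particular order.
    top = np.argpartition(arr, -k)[-k:]
    # Sort only those k candidates, then reverse to get descending order.
    top = top[np.argsort(arr[top])[::-1]]
    return arr[top], top

values, indices = topk_sorted(np.array([1, 8, 3, 9, 2, 7, 5]), 3)
print(values)   # [9 8 7]
print(indices)  # [3 1 5]
```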

  • 2020-11-22 04:46

    bottleneck has a partial sort function, if the expense of sorting the entire array just to get the N largest values is too great.

    I know nothing about this module; I just googled numpy partial sort.

  • 2020-11-22 04:47

    Newer NumPy versions (1.8 and up) have a function called argpartition for this. To get the indices of the four largest elements, do

    >>> a = np.array([9, 4, 4, 3, 3, 9, 0, 4, 6, 0])
    >>> a
    array([9, 4, 4, 3, 3, 9, 0, 4, 6, 0])
    >>> ind = np.argpartition(a, -4)[-4:]
    >>> ind
    array([1, 5, 8, 0])
    >>> a[ind]
    array([4, 9, 6, 9])
    

    Unlike argsort, this function runs in linear time in the worst case, but the returned indices are not sorted, as can be seen from the result of evaluating a[ind]. If you need that too, sort them afterwards:

    >>> ind[np.argsort(a[ind])]
    array([1, 8, 5, 0])
    

    To get the top-k elements in sorted order in this way takes O(n + k log k) time.

  • 2020-11-22 04:47

    For multidimensional arrays you can use the axis keyword to apply the partitioning along the desired axis.

    # For a 2D array
    indices = np.argpartition(arr, -N, axis=1)[:, -N:]
    

    And for grabbing the items:

    x = arr.shape[0]
    arr[np.repeat(np.arange(x), N), indices.ravel()].reshape(x, N)
    

    Note that this won't return a sorted result. If you need one, use np.argsort() along the intended axis instead:

    indices = np.argsort(arr, axis=1)[:, -N:]
    
    # Result
    x = arr.shape[0]
    arr[np.repeat(np.arange(x), N), indices.ravel()].reshape(x, N)
    

    Here is an example:

    In [42]: a = np.random.randint(0, 20, (10, 10))
    
    In [44]: a
    Out[44]:
    array([[ 7, 11, 12,  0,  2,  3,  4, 10,  6, 10],
           [16, 16,  4,  3, 18,  5, 10,  4, 14,  9],
           [ 2,  9, 15, 12, 18,  3, 13, 11,  5, 10],
           [14,  0,  9, 11,  1,  4,  9, 19, 18, 12],
           [ 0, 10,  5, 15,  9, 18,  5,  2, 16, 19],
           [14, 19,  3, 11, 13, 11, 13, 11,  1, 14],
           [ 7, 15, 18,  6,  5, 13,  1,  7,  9, 19],
           [11, 17, 11, 16, 14,  3, 16,  1, 12, 19],
           [ 2,  4, 14,  8,  6,  9, 14,  9,  1,  5],
           [ 1, 10, 15,  0,  1,  9, 18,  2,  2, 12]])
    
    In [45]: np.argpartition(a, np.argmin(a, axis=0))[:, 1:] # 1 is because the first item is the minimum one.
    Out[45]:
    array([[4, 5, 6, 8, 0, 7, 9, 1, 2],
           [2, 7, 5, 9, 6, 8, 1, 0, 4],
           [5, 8, 1, 9, 7, 3, 6, 2, 4],
           [4, 5, 2, 6, 3, 9, 0, 8, 7],
           [7, 2, 6, 4, 1, 3, 8, 5, 9],
           [2, 3, 5, 7, 6, 4, 0, 9, 1],
           [4, 3, 0, 7, 8, 5, 1, 2, 9],
           [5, 2, 0, 8, 4, 6, 3, 1, 9],
           [0, 1, 9, 4, 3, 7, 5, 2, 6],
           [0, 4, 7, 8, 5, 1, 9, 2, 6]])
    
    In [46]: np.argpartition(a, np.argmin(a, axis=0))[:, -3:]
    Out[46]:
    array([[9, 1, 2],
           [1, 0, 4],
           [6, 2, 4],
           [0, 8, 7],
           [8, 5, 9],
           [0, 9, 1],
           [1, 2, 9],
           [3, 1, 9],
           [5, 2, 6],
           [9, 2, 6]])
    
    In [89]: a[np.repeat(np.arange(x), 3), ind.ravel()].reshape(x, 3)  # with x = a.shape[0] and ind = Out[46]
    Out[89]:
    array([[10, 11, 12],
           [16, 16, 18],
           [13, 15, 18],
           [14, 18, 19],
           [16, 18, 19],
           [14, 14, 19],
           [15, 18, 19],
           [16, 17, 19],
           [ 9, 14, 14],
           [12, 15, 18]])
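
    The row-wise gathering shown above can also be written with np.take_along_axis (available in NumPy 1.15 and later), which avoids the repeat/ravel/reshape step. This is a sketch under assumptions: the array, the seed, and N = 3 are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)           # arbitrary seed, for reproducibility only
arr = rng.integers(0, 20, size=(5, 8))   # made-up 2-D data
N = 3

# Unordered indices of the N largest entries in each row.
part = np.argpartition(arr, -N, axis=1)[:, -N:]

# Sort just those N candidates per row, in ascending order of value.
order = np.argsort(np.take_along_axis(arr, part, axis=1), axis=1)
indices = np.take_along_axis(part, order, axis=1)

# Gather the values that the sorted indices point to, row by row.
top_values = np.take_along_axis(arr, indices, axis=1)
print(top_values.shape)  # (5, 3)
```

    Each row of top_values matches the last N columns of np.sort(arr, axis=1), while only N elements per row are fully sorted.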
    
  • 2020-11-22 04:50

    Use:

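    import numpy as np

    def max_indices(arr, k):
        '''
        Returns the indices of the k largest distinct values of arr,
        in descending order of value. Positions that tie for a value
        are returned together in a single np.where result.
        '''
        assert k <= arr.size, 'k should be smaller than or equal to the array size'
        arr_ = arr.astype(float)  # float copy, so -inf can mark visited entries
        max_idxs = []
        for _ in range(k):
            max_element = np.max(arr_)
            if np.isinf(max_element):
                break  # nothing finite is left to pick
            idx = np.where(arr_ == max_element)  # all positions holding the current maximum
            max_idxs.append(idx)
            arr_[idx] = -np.inf  # exclude them from later iterations
        return max_idxs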
    

    It also works with 2D arrays. For example,

    In [0]: A = np.array([[ 0.51845014,  0.72528114],
                         [ 0.88421561,  0.18798661],
                         [ 0.89832036,  0.19448609],
                         [ 0.89832036,  0.19448609]])
    In [1]: max_indices(A, 8)
    Out[1]:
        [(array([2, 3], dtype=int64), array([0, 0], dtype=int64)),
         (array([1], dtype=int64), array([0], dtype=int64)),
         (array([0], dtype=int64), array([1], dtype=int64)),
         (array([0], dtype=int64), array([0], dtype=int64)),
         (array([2, 3], dtype=int64), array([1, 1], dtype=int64)),
         (array([1], dtype=int64), array([1], dtype=int64))]
    
    In [2]: A[max_indices(A, 8)[0]][0]
    Out[2]: array([ 0.89832036])
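
    For multidimensional input, a vectorized alternative to the loop is to partition the flattened array and convert the flat positions back with np.unravel_index. This is only a sketch: top_k_coords is a hypothetical helper name, it assumes 1 <= k <= arr.size, and unlike max_indices it counts tied values individually instead of grouping them.

```python
import numpy as np

def top_k_coords(arr, k):
    """Hypothetical helper: coordinates of the k largest entries of an
    N-dimensional array, largest first (ties are counted separately)."""
    arr = np.asarray(arr)
    flat = arr.ravel()
    # Unordered flat positions of the k largest entries.
    part = np.argpartition(flat, -k)[-k:]
    # Order those k positions by value, largest first.
    part = part[np.argsort(flat[part])[::-1]]
    # Convert flat positions into one index array per dimension.
    return np.unravel_index(part, arr.shape)

A = np.array([[0.51845014, 0.72528114],
              [0.88421561, 0.18798661],
              [0.89832036, 0.19448609],
              [0.89832036, 0.19448609]])
rows, cols = top_k_coords(A, 3)
print(A[rows, cols])  # the three largest values, largest first
```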
    
  • 2020-11-22 04:54

    Simpler yet:

    idx = (-arr).argsort()[:n]
    

    where n is the number of maximum values.
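
    One caveat, sketched under the assumption that arr may hold unsigned integers: negating an unsigned array wraps around, so (-arr).argsort() can rank a zero ahead of the real maxima. Reversing an ascending argsort avoids the negation. The array below is made up for illustration.

```python
import numpy as np

arr = np.array([0, 3, 1, 7, 5], dtype=np.uint8)  # made-up unsigned data
n = 2

# Negation wraps around for unsigned dtypes: 0 stays 0 and sorts first.
wrong = (-arr).argsort()[:n]

# Ascending argsort, reversed, then truncated to the n largest.
idx = arr.argsort()[::-1][:n]

print(wrong, arr[wrong])  # [0 3] [0 7]
print(idx, arr[idx])      # [3 4] [7 5]
```

    Both variants sort the whole array, so they cost O(n log n) regardless of how few indices are requested.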
