output of numpy.where(condition) is not an array, but a tuple of arrays: why?

终归单人心 2020-12-23 14:08

I am experimenting with the numpy.where(condition[, x, y]) function.
From the numpy documentation, I learn that if you give just one array as input, it shou

3 Answers
  • 2020-12-23 14:31

    In Python, (1) means just 1. Parentheses can be added freely to group numbers and expressions for readability (compare (1+3)*3 with (1+3,)*3). To denote a one-element tuple, Python therefore writes (1,), and requires you to write it that way as well.

    Thus

    (array([4, 5, 6, 7, 8]),)
    

    is a one element tuple, that element being an array.
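To make the parenthesis rule concrete, here is a minimal sketch in plain Python (the variable names are made up for illustration):

```python
# Parentheses alone only group an expression; the trailing comma makes the tuple.
grouped = (1)     # plain int
single = (1,)     # one-element tuple

print(type(grouped).__name__)   # int
print(type(single).__name__)    # tuple

# The same operator then means different things:
product = (1 + 3) * 3      # integer multiplication
repeated = (1 + 3,) * 3    # tuple repetition
print(product)             # 12
print(repeated)            # (4, 4, 4)
```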

    If you applied where to a 2d array, the result would be a 2 element tuple.

    The result of where is such that it can be plugged directly into an indexing slot, e.g.

    a[where(a>0)]
    a[a>0]
    

    should return the same things

    as would

    I,J = where(a>0)   # a is 2d
    a[I,J]
    a[(I,J)]
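The 2d case described above can be checked with a small sketch. The array below is a hypothetical sample chosen only for illustration; it is not from the question.

```python
import numpy as np

# Hypothetical 2-D sample array, used only to illustrate the point.
a = np.array([[-1, 2, 0],
              [3, -4, 5]])

idx = np.where(a > 0)   # tuple of two arrays: (row indices, column indices)
I, J = idx

print(len(idx))         # 2
print(I, J)             # [0 1 1] [1 0 2]

via_where = a[idx]      # index with the tuple directly
via_mask = a[a > 0]     # boolean-mask indexing
via_unpacked = a[I, J]  # index with the unpacked arrays

print(via_where)        # [2 3 5]
```

All three indexing forms select the same elements, which is why the tuple form is convenient.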
    

    Or with your example:

    In [278]: a=np.array([1,2,3,4,5,6,7,8,9])
    In [279]: np.where(a>4)
    Out[279]: (array([4, 5, 6, 7, 8], dtype=int32),)  # tuple
    
    In [280]: a[np.where(a>4)]
    Out[280]: array([5, 6, 7, 8, 9])
    
    In [281]: I=np.where(a>4)
    In [282]: I
    Out[282]: (array([4, 5, 6, 7, 8], dtype=int32),)
    In [283]: a[I]
    Out[283]: array([5, 6, 7, 8, 9])
    
    In [286]: i, = np.where(a>4)   # note the , on LHS
    In [287]: i
    Out[287]: array([4, 5, 6, 7, 8], dtype=int32)  # not tuple
    In [288]: a[i]
    Out[288]: array([5, 6, 7, 8, 9])
    In [289]: a[(i,)]
    Out[289]: array([5, 6, 7, 8, 9])
    

    ======================

    np.flatnonzero shows the correct way of returning just one array, regardless of the dimensions of the input array.

    In [299]: np.flatnonzero(a>4)
    Out[299]: array([4, 5, 6, 7, 8], dtype=int32)
    In [300]: np.flatnonzero(a>4)+10
    Out[300]: array([14, 15, 16, 17, 18], dtype=int32)
    

    Its doc says:

    This is equivalent to a.ravel().nonzero()[0]

    In fact that is literally what the function does.

    Flattening a removes the question of what to do with multiple dimensions. The function then takes the result out of the tuple, giving you a plain array. Because it flattens, it doesn't have to make a special case for 1d arrays.
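As a rough illustration of the flattening behaviour on a multi-dimensional input, here is a sketch with a made-up 2d array b. The np.unravel_index step is an addition, not part of the answer; it only shows how flat positions map back to per-axis indices.

```python
import numpy as np

# Hypothetical 2-D array for illustration.
b = np.array([[0, 7, 0],
              [9, 0, 3]])

flat_idx = np.flatnonzero(b > 2)         # one plain 1-D array of flat positions
manual = (b > 2).ravel().nonzero()[0]    # the equivalent spelled out by hand

print(flat_idx)                          # [1 3 5]
print(b.ravel()[flat_idx])               # [7 9 3]

# Flat positions can be converted back to (row, column) indices if needed.
rows, cols = np.unravel_index(flat_idx, b.shape)
print(rows, cols)                        # [0 1 1] [1 0 2]
```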

    ===========================

    @Divakar suggests np.argwhere:

    In [303]: np.argwhere(a>4)
    Out[303]: 
    array([[4],
           [5],
           [6],
           [7],
           [8]], dtype=int32)
    

    which does np.transpose(np.where(a>4))

    Or if you don't like the column vector, you could transpose it again

    In [307]: np.argwhere(a>4).T
    Out[307]: array([[4, 5, 6, 7, 8]], dtype=int32)
    

    except now it is a 1xn array.

    We could just as well have wrapped where in array:

    In [311]: np.array(np.where(a>4))
    Out[311]: array([[4, 5, 6, 7, 8]], dtype=int32)
    

    There are lots of ways of taking an array out of the where tuple ([0], i,=, transpose, array, etc).
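The extraction options listed above can be compared side by side. This is only a sketch using the array from the question; the variable names are invented for illustration.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
result = np.where(a > 4)               # one-element tuple

by_index = result[0]                   # 1-D array, shape (5,)
(by_unpack,) = result                  # same array via tuple unpacking
by_transpose = np.transpose(result)    # column vector, shape (5, 1)
by_array = np.array(result)            # row vector, shape (1, 5)

print(by_index.shape, by_transpose.shape, by_array.shape)
# (5,) (5, 1) (1, 5)
```

Only the first two give a plain 1d index array; the transpose and array forms keep an extra axis.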

  • 2020-12-23 14:44

    Just use the np.asarray function. In your case:

    >>> import numpy as np
    >>> array = np.array([1,2,3,4,5,6,7,8,9])
    >>> pippo = np.asarray(np.where(array>4))
    >>> pippo + 1
    array([[5, 6, 7, 8, 9]])
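Note the double brackets in that output: the result is 2d. A small sketch of the shapes involved, reusing the names from the snippet above (flat is a name added here for illustration):

```python
import numpy as np

array = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
pippo = np.asarray(np.where(array > 4))

print(pippo.shape)   # (1, 5): one row per dimension of the input

# Taking the first row gives a plain 1-D index array.
flat = pippo[0]
print(flat.shape)    # (5,)
print(flat + 1)      # [5 6 7 8 9]
```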
    
  • 2020-12-23 14:45

    Short answer: np.where is designed to have consistent output regardless of the dimension of the array.

    A two-dimensional array has two indices, so the result of np.where is a length-2 tuple containing the relevant indices. This generalizes to a length-3 tuple for 3 dimensions, a length-4 tuple for 4 dimensions, or a length-N tuple for N dimensions. By this rule, it is clear that in 1 dimension the result should be a length-1 tuple.
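The rule can be checked with a quick sketch. The all-ones test arrays are an arbitrary choice made only for illustration.

```python
import numpy as np

lengths = []
for ndim in (1, 2, 3, 4):
    arr = np.ones((2,) * ndim)     # array with ndim axes, each of length 2
    idx = np.where(arr > 0)        # always a tuple, one index array per axis
    lengths.append(len(idx))
    print(ndim, type(idx).__name__, len(idx))

# The tuple length always matches the number of dimensions.
print(lengths)                     # [1, 2, 3, 4]
```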
