Select certain rows (condition met), but only some columns in Python/NumPy

一整个雨季 2020-12-15 05:29

I have a NumPy array with 4 columns and want to select columns 1, 3 and 4 from the rows where the value in the second column meets a certain condition (i.e. equals a fixed value). I tried to f

5 answers
  • 2020-12-15 05:46
    >>> import numpy as np
    >>> a=np.array([[1,2,3], [1,3,4], [2,2,5]])
    >>> a[a[:,0]==1][:,[0,1]]  # rows whose first column equals 1, then columns 0 and 1
    array([[1, 2],
           [1, 3]])
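
The one-liner above can be read as three separate steps. The following is only a sketch of that reading; the names mask, rows and result are introduced here for illustration and do not appear in the answer.

```python
import numpy as np

a = np.array([[1, 2, 3], [1, 3, 4], [2, 2, 5]])

mask = a[:, 0] == 1       # one boolean per row: does column 0 equal 1?
rows = a[mask]            # keep only the rows where the mask is True
result = rows[:, [0, 1]]  # from those rows, keep columns 0 and 1

print(mask)
print(result)
```

Each indexing step returns a new array, so the original a is left unchanged.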
  • 2020-12-15 05:54

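    This also works.

    # i serves both as the 1-based number of the column to test (and drop) and as the value to match
    I = np.array([row[[x for x in range(A.shape[1]) if x != i-1]] for row in A if row[i-1] == i])
    print(I)

    Edit: since indexing starts from 0, i-1 has to be used as the column index.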

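
In the comprehension above, i is both the column number and the value being matched, which only fits the question when the two happen to coincide. A possible variant keeps them apart. This is a sketch, not the answerer's code: the helper name rows_matching_without_column and its parameters col and value are hypothetical, and the sample array is borrowed from another answer in this thread.

```python
import numpy as np

def rows_matching_without_column(A, col, value):
    """Return the rows of A where A[row, col] == value, with column `col` dropped."""
    # 0-based indexes of every column except the one being tested
    keep = [c for c in range(A.shape[1]) if c != col]
    return np.array([row[keep] for row in A if row[col] == value])

A = np.array([[1, 2, 3, 4], [6, 1, 3, 4], [3, 2, 5, 6]])
result = rows_matching_without_column(A, col=1, value=2)
print(result)
```

The Python-level loop over rows is likely to be slower than the pure indexing answers on large arrays, so this form is mostly useful for readability.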
  • 2020-12-15 05:56

    If you would rather give the columns as integer indexes than as a boolean mask, you can write it this way:

    A[:, [0, 2, 3]][A[:, 1] == i]
    

    Going back to your example:

    >>> import numpy as np
    >>> A = np.array([[1,2,3,4],[6,1,3,4],[3,2,5,6]])
    >>> print(A)
    [[1 2 3 4]
     [6 1 3 4]
     [3 2 5 6]]
    >>> i = 2
    >>> print(A[:, [0, 2, 3]][A[:, 1] == i])
    [[1 3 4]
     [3 5 6]]
    

    Seriously,

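
It may help to see why the two selections are chained rather than written as a single index. The sketch below assumes the same A and i as the example above; the names mask, combined_ok, rows_then_cols and cols_then_rows are illustrative only. When a boolean row mask and a list of column numbers share one pair of brackets, NumPy pairs them element by element instead of forming a row/column grid.

```python
import numpy as np

A = np.array([[1, 2, 3, 4], [6, 1, 3, 4], [3, 2, 5, 6]])
i = 2
mask = A[:, 1] == i  # boolean row selector: [True, False, True]

# Two selected rows cannot be paired element-wise with three column
# indexes, so the combined form raises IndexError for this data.
try:
    A[mask, [0, 2, 3]]
    combined_ok = True
except IndexError:
    combined_ok = False

# Chaining the two selections works, in either order.
rows_then_cols = A[mask][:, [0, 2, 3]]
cols_then_rows = A[:, [0, 2, 3]][mask]

print(combined_ok)
print(rows_then_cols)
```

Both chained forms give the same result here; selecting rows first copies fewer elements when the condition matches only a few rows.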
  • 2020-12-15 05:57

    I hope this answers your question. A pattern I have used with pandas is:

    df_targetrows = df.loc[df[col2filter] <somecondition>, [col1, col2, ..., coln]]
    

    For example,

    targets = stockdf.loc[stockdf['rtns'] > .04, ['symbol','date','rtns']]
    

    This returns a DataFrame containing only the columns ['symbol','date','rtns'] of stockdf, restricted to the rows where stockdf['rtns'] > .04.

    Hope this helps.

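
For anyone who wants to try the pandas pattern, here is a small self-contained sketch. The contents of stockdf, including the extra volume column, are made up purely for illustration; only the column names symbol, date and rtns come from the answer.

```python
import pandas as pd

# Hypothetical sample data; the real stockdf is not shown in the answer.
stockdf = pd.DataFrame({
    "symbol": ["AAA", "BBB", "CCC"],
    "date": ["2020-01-02", "2020-01-02", "2020-01-03"],
    "rtns": [0.05, 0.01, 0.07],
    "volume": [100, 200, 300],
})

# Row condition first, column list second, both inside one .loc call.
targets = stockdf.loc[stockdf["rtns"] > 0.04, ["symbol", "date", "rtns"]]
print(targets)
```

Unlike plain NumPy indexing, .loc accepts a boolean row mask and a list of column labels together, so no chaining is needed.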
  • 2020-12-15 06:05
    >>> import numpy as np
    >>> a = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
    >>> a
    array([[ 1,  2,  3,  4],
           [ 5,  6,  7,  8],
           [ 9, 10, 11, 12]])
    
    >>> a[a[:,0] > 3] # select rows where first column is greater than 3
    array([[ 5,  6,  7,  8],
           [ 9, 10, 11, 12]])
    
    >>> a[a[:,0] > 3][:,np.array([True, True, False, True])] # select columns
    array([[ 5,  6,  8],
           [ 9, 10, 12]])
    
    # fancier equivalent of the previous
    >>> a[np.ix_(a[:,0] > 3, np.array([True, True, False, True]))]
    array([[ 5,  6,  8],
           [ 9, 10, 12]])
    

    For an explanation of the obscure np.ix_(), see https://stackoverflow.com/a/13599843/4323

    Finally, we can simplify by giving the list of column numbers instead of the tedious boolean mask:

    >>> a[np.ix_(a[:,0] > 3, (0,1,3))]
    array([[ 5,  6,  8],
           [ 9, 10, 12]])
    
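
As a rough illustration of what np.ix_() does in the examples above, the sketch below unpacks its return value. The names row_idx, col_idx and selected are chosen here for illustration. np.ix_ turns each 1-D selector into an index array shaped so that the pair broadcasts to a full rows-by-columns grid.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# The boolean row mask becomes a column vector of row numbers,
# and the column tuple becomes a row vector of column numbers.
row_idx, col_idx = np.ix_(a[:, 0] > 3, (0, 1, 3))
print(row_idx.shape, col_idx.shape)

# Indexing with the pair picks every (row, column) combination.
selected = a[row_idx, col_idx]
print(selected)
```

Compared with chaining two index operations, this builds the result in a single indexing step.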