Sorting arrays in NumPy by column

既然无缘 2020-11-22 03:47

How can I sort an array in NumPy by the nth column?

For example,

a = array([[9, 2, 3],
           [4, 5, 6],
           [7, 0, 5]])

13 answers
  • 2020-11-22 04:12

    From the NumPy mailing list, here's another solution:

    >>> a
    array([[1, 2],
           [0, 0],
           [1, 0],
           [0, 2],
           [2, 1],
           [1, 0],
           [1, 0],
           [0, 0],
           [1, 0],
           [2, 2]])
    >>> a[np.lexsort(np.fliplr(a).T)]
    array([[0, 0],
           [0, 0],
           [0, 2],
           [1, 0],
           [1, 0],
           [1, 0],
           [1, 0],
           [1, 2],
           [2, 1],
           [2, 2]])
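
Why the one-liner above works: np.lexsort treats the last key it is given as the primary key, so flipping the columns before transposing makes column 0 the primary key and column 1 the tie-breaker. A minimal sketch on a small made-up array (the data and the names `order`, `sorted_rows` and `explicit` are illustrative, not from the answer):

```python
import numpy as np

# Small illustrative array (not the one from the answer above).
a = np.array([[1, 2],
              [0, 0],
              [1, 0],
              [0, 2],
              [2, 1]])

# np.lexsort uses the LAST key as the primary sort key.
# np.fliplr(a).T yields the keys (column 1, column 0), so column 0 is primary.
order = np.lexsort(np.fliplr(a).T)
sorted_rows = a[order]

# The same ordering with the keys spelled out: primary key goes last.
explicit = a[np.lexsort((a[:, 1], a[:, 0]))]

print(sorted_rows)
print(np.array_equal(sorted_rows, explicit))
```

To sort by a different column priority, pass the key columns to np.lexsort explicitly, least significant first.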
    
  • 2020-11-22 04:14

    I had a similar problem.

    My Problem:

    I want to calculate an SVD and need to sort my eigenvalues in descending order. But I want to keep the mapping between eigenvalues and eigenvectors. My eigenvalues were in the first row and the corresponding eigenvector below it in the same column.

    So I want to sort a two-dimensional array column-wise by the first row in descending order.

    My Solution

    a = a[::, a[0,].argsort()[::-1]]
    

    So how does this work?

    a[0,] is just the first row I want to sort by.

    Now I use argsort to get the order of indices.

    I use [::-1] because I need descending order.

    Lastly I use a[::, ...] to get a copy of the array with the columns in the right order (indexing with an integer array returns a copy, not a view).
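
The steps above can be sketched on a tiny array. The layout (values to sort by in row 0, associated vectors in the rows below) follows the description in this answer, but the numbers and the names `order` and `b` are made up for illustration:

```python
import numpy as np

# Hypothetical layout: row 0 holds the eigenvalues, the rows below hold the
# matching eigenvectors, one per column.
a = np.array([[2.0, 5.0, 1.0],
              [0.1, 0.2, 0.3],
              [0.4, 0.5, 0.6]])

order = a[0].argsort()[::-1]   # column indices, largest value in row 0 first
b = a[:, order]                # move whole columns together

print(b)
print(np.shares_memory(a, b))  # integer-array indexing produces a new array
```

Because whole columns are moved, each value in row 0 stays paired with the vector below it. Note that reversing with [::-1] also reverses the relative order of equal values.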

  • 2020-11-22 04:15

    You can sort on multiple columns as per Steve Tjoa's method by using a stable sort like mergesort and sorting the indices from the least significant to the most significant columns:

    a = a[a[:,2].argsort()] # First sort doesn't need to be stable.
    a = a[a[:,1].argsort(kind='mergesort')]
    a = a[a[:,0].argsort(kind='mergesort')]
    

    This sorts by column 0, then 1, then 2.
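
As a cross-check, the chained stable sorts should give the same row order as a single np.lexsort call whose last key is the most significant column. A small sketch, assuming random integer data with many ties (the seed, shape and variable names are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed, illustrative data only
a = rng.integers(0, 3, size=(20, 3))    # small value range so ties are common

# Successive sorts, least significant column first; every pass after the
# first must be stable so earlier orderings survive among equal keys.
b = a[a[:, 2].argsort()]
b = b[b[:, 1].argsort(kind='mergesort')]
b = b[b[:, 0].argsort(kind='mergesort')]

# One lexsort call; the last key is the most significant.
c = a[np.lexsort((a[:, 2], a[:, 1], a[:, 0]))]

print(np.array_equal(b, c))
```

The lexsort form does the same job in one indexing step, which may be easier to read when the number of key columns grows.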

  • 2020-11-22 04:15

    In case someone wants to use sorting in a performance-critical part of their program, here's a performance comparison of the different proposals:

    import numpy as np
    table = np.random.rand(5000, 10)
    
    %timeit table.view('f8,f8,f8,f8,f8,f8,f8,f8,f8,f8').sort(order=['f9'], axis=0)
    1000 loops, best of 3: 1.88 ms per loop
    
    %timeit table[table[:,9].argsort()]
    10000 loops, best of 3: 180 µs per loop
    
    import pandas as pd
    df = pd.DataFrame(table)
    %timeit df.sort_values(9, ascending=True)
    1000 loops, best of 3: 400 µs per loop
    

    So, it looks like indexing with argsort is the quickest method so far...
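
For readers without IPython, a rough way to repeat part of this comparison with the standard timeit module is sketched below. The helper names `sort_structured` and `sort_argsort` are hypothetical, and the structured-view variant copies the table on every call (the in-place sort in the transcript above re-sorts already sorted data after its first loop), so the numbers are not directly comparable with the ones quoted. Timings will also vary by machine and NumPy version.

```python
import timeit
import numpy as np

table = np.random.rand(5000, 10)
fields = ','.join(['f8'] * 10)            # one float64 field per column

def sort_structured(t):
    t = t.copy()                          # fresh copy, since sort() is in place
    t.view(fields).sort(order=['f9'], axis=0)
    return t

def sort_argsort(t):
    return t[t[:, 9].argsort()]           # returns a new, reordered array

# Both approaches should order the rows by the last column.
by_view = sort_structured(table)
by_argsort = sort_argsort(table)
print(np.array_equal(by_view, by_argsort))

for fn in (sort_structured, sort_argsort):
    seconds = timeit.timeit(lambda: fn(table), number=20)
    print(fn.__name__, seconds / 20, 'seconds per call')
```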

  • 2020-11-22 04:20
    import numpy as np
    a = np.array([[21, 20, 19, 18, 17],
                  [16, 15, 14, 13, 12],
                  [11, 10, 9, 8, 7],
                  [6, 5, 4, 3, 2]])
    y = np.argsort(a[:, 2], kind='mergesort')  # a[:, 2] is [19, 14, 9, 4]
    a = a[y]  # reorder the rows by column 2
    print(a)
    

    Desired output is [[6,5,4,3,2],[11,10,9,8,7],[16,15,14,13,12],[21,20,19,18,17]]

    Note that np.argsort(x) returns the indices that would sort x, so x[np.argsort(x)] is the sorted array.

    example

    x = np.array([8, 1, 5])
    z = np.argsort(x)  # [1, 2, 0] are the indices that sort x
    print(x[z])        # integer-array indexing reorders x using the indices in z
    

    answer would be [1,5,8]

  • 2020-11-22 04:21

    This is an old question, but if you need to generalize this to arrays with more than two dimensions, here is a solution that can be easily generalized:

    np.einsum('ij->ij', a[a[:,1].argsort(),:])
    

    This is overkill for two dimensions, where a[a[:,1].argsort()] is enough, as in @steve's answer; however, that answer cannot be generalized to higher dimensions. You can find an example with a 3D array in this question.

    Output:

    [[7 0 5]
     [9 2 3]
     [4 5 6]]
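
The answer does not show the higher-dimensional case itself. One possible reading is "sort the rows of every 2D matrix in a stack by a given column"; under that assumption, a sketch using np.take_along_axis (a different technique from the einsum call above, swapped in for illustration) could look like this. The 3D array `b` and the names `col`, `idx` and `out` are made up:

```python
import numpy as np

# Hypothetical 3-D input: a stack of two 3x3 matrices.
b = np.array([[[9, 2, 3],
               [4, 5, 6],
               [7, 0, 5]],
              [[1, 8, 2],
               [3, 1, 4],
               [5, 4, 0]]])

col = 1
# For each matrix, the row order that sorts its column `col`.
idx = b[:, :, col].argsort(axis=1, kind='stable')      # shape (2, 3)
# Add a trailing axis so the same row order is applied to every column.
out = np.take_along_axis(b, idx[:, :, None], axis=1)

print(out)
```

Each matrix in the stack is reordered independently, so out[0] matches the 2D result shown above for the question's array.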
    