'DataFrame' object has no attribute 'sort'

慢半拍i 2020-11-28 05:19

I'm running into a problem: I have NumPy installed in my Python environment, but I still get the error 'DataFrame' object has no attribute 'sort'.

Anyo

2 Answers
  • 2020-11-28 05:30

    Pandas Sorting 101

    DataFrame.sort was removed in v0.20; use DataFrame.sort_values and DataFrame.sort_index instead. Aside from these, there is also argsort.

    Here are some common use cases in sorting, and how to solve them using the sorting functions in the current API. First, the setup.

    # Setup
    import numpy as np
    import pandas as pd

    np.random.seed(0)
    df = pd.DataFrame({'A': list('accab'), 'B': np.random.choice(10, 5)})
    df
       A  B
    0  a  7
    1  c  9
    2  c  3
    3  a  5
    4  b  2
    

    Sort by Single Column

    For example, to sort df by column "A", use sort_values with a single column name:

    df.sort_values(by='A')
    
       A  B
    0  a  7
    3  a  5
    4  b  2
    1  c  9
    2  c  3
    

    If you need a fresh RangeIndex, use DataFrame.reset_index.
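
    As a minimal sketch of that step, the snippet below sorts by "A" and then renumbers the rows from 0. The frame is rebuilt with explicit values copied from the printed setup output (an assumption made so the example does not depend on the random seed), and drop=True is used so the old labels are discarded rather than kept as a new column.

```python
import pandas as pd

# Explicit values copied from the frame printed in the setup.
df = pd.DataFrame({'A': list('accab'), 'B': [7, 9, 3, 5, 2]})

# Sort by column "A"; the original labels travel with their rows.
sorted_df = df.sort_values(by='A')

# Discard the old labels and renumber the rows 0..n-1.
result = sorted_df.reset_index(drop=True)
print(result)
```

    Without drop=True, reset_index keeps the previous labels in an extra column named "index".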

    Sort by Multiple Columns

    For example, to sort df by both columns "A" and "B", pass a list to sort_values:

    df.sort_values(by=['A', 'B'])
    
       A  B
    3  a  5
    0  a  7
    4  b  2
    2  c  3
    1  c  9
    

    Sort By DataFrame Index

    # Shuffle the rows so that the index is no longer in order
    df2 = df.sample(frac=1)
    df2
    
       A  B
    1  c  9
    0  a  7
    2  c  3
    3  a  5
    4  b  2
    

    You can do this using sort_index:

    df2.sort_index()
    
       A  B
    0  a  7
    1  c  9
    2  c  3
    3  a  5
    4  b  2
    
    df.equals(df2)                                                                                                                            
    # False
    df.equals(df2.sort_index())                                                                                                               
    # True
    

    Here are some equivalent approaches, with their timings:

    %timeit df2.sort_index()                                                                                                                  
    %timeit df2.iloc[df2.index.argsort()]                                                                                                     
    %timeit df2.reindex(np.sort(df2.index))                                                                                                   
    
    605 µs ± 13.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    610 µs ± 24.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    581 µs ± 7.63 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    

    Sort by List of Indices

    For example, take the integer positions that would sort the index of df2:

    idx = df2.index.argsort()
    idx
    # array([1, 0, 2, 3, 4])
    

    This "sorting" problem is actually a simple indexing problem. Passing the integer positions to iloc is enough.

    df.iloc[idx]
    
       A  B
    1  c  9
    0  a  7
    2  c  3
    3  a  5
    4  b  2
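
    The same idea can be checked end to end. This is a sketch under two assumptions: the frame uses explicit values copied from the printed setup output, and df2 is built with a fixed row order (matching the shuffled frame shown earlier) instead of df.sample, so the result is reproducible.

```python
import pandas as pd

df = pd.DataFrame({'A': list('accab'), 'B': [7, 9, 3, 5, 2]})

# Fixed "shuffle" standing in for df.sample(frac=1).
df2 = df.loc[[1, 0, 2, 3, 4]]

# Positions that would put df2's index labels in ascending order.
idx = df2.index.argsort()

# iloc selects rows by position, so idx reorders whichever frame it is applied to.
reordered = df.iloc[idx]      # df rearranged into df2's row order
restored = df2.iloc[idx]      # df2 sorted back by its index

print(reordered.equals(df2))
print(restored.equals(df2.sort_index()))
```

    Because iloc works on positions rather than labels, it behaves the same whatever the index contains.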
    
  • 2020-11-28 05:46

    sort() was deprecated for DataFrames in favor of either:

    • sort_values() to sort by column(s)
    • sort_index() to sort by the index

    sort() was deprecated (but still available) in Pandas with release 0.17 (2015-10-09) with the introduction of sort_values() and sort_index(). It was removed from Pandas with release 0.20 (2017-05-05).
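
    A short sketch of what the migration looks like in practice, assuming pandas 0.20 or later is installed. The frame and its values are illustrative only and are not taken from the question.

```python
import pandas as pd

df = pd.DataFrame({'A': list('accab'), 'B': [7, 9, 3, 5, 2]})

# The old method no longer exists on pandas 0.20+, which is what
# raises the AttributeError in the question.
has_sort = hasattr(df, 'sort')
print(has_sort)

# Replacement for sorting by the values of one or more columns.
by_column = df.sort_values(by='B')

# Replacement for sorting by the index labels.
by_index = by_column.sort_index()

print(by_column)
print(by_index)
```

    Both methods return a new DataFrame by default and leave the original untouched.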
