Return the n smallest indexes by column using pandas

被撕碎了的回忆 2020-12-31 07:26

I have the following (simplified) dataframe:

df = pd.DataFrame({'X': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                   'Y': [10, 20, 30, 40, 50, -10, -20, -30, -40, -50],
                   'Z': [20,18


        
3 answers
  •  囚心锁ツ
    2020-12-31 08:20

    A faster NumPy solution uses numpy.argsort:

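    import numpy as np
    import pandas as pd

    N = 3
    # Negate, argsort, then read the last N rows backwards: this gives the
    # row positions of the N smallest values per column, smallest first.
    a = np.argsort(-df.values, axis=0)[-1:-1-N:-1]
    print(a)
    [[0 9 9]
     [1 8 8]
     [2 7 7]]

    # Look the labels up in a NumPy array; newer pandas versions no longer
    # allow 2-D indexing directly on an Index.
    b = pd.DataFrame(df.index.to_numpy()[a], columns=df.columns)
    print(b)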
       X  Y  Z
    0  A  J  J
    1  B  I  I
    2  C  H  H
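
Since the DataFrame in the question is cut off, here is a minimal self-contained sketch of the same idea. The `Z` values and the `A`..`J` index labels are assumptions, chosen only so that the result lines up with the output shown above. The last line cross-checks the result against the per-column `nsmallest` approach.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the Z values and the A..J index labels are assumptions,
# because the DataFrame in the question is truncated.
df = pd.DataFrame(
    {
        "X": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
        "Y": [10, 20, 30, 40, 50, -10, -20, -30, -40, -50],
        "Z": [20, 18, 16, 14, 12, 10, 8, 6, 4, 2],
    },
    index=list("ABCDEFGHIJ"),
)

N = 3

# Row positions of the N smallest values in each column, smallest first.
positions = np.argsort(-df.to_numpy(), axis=0)[-1:-1 - N:-1]

# Map the positions back to index labels via a plain NumPy array.
smallest_labels = pd.DataFrame(df.index.to_numpy()[positions], columns=df.columns)

# Slower reference result: nsmallest applied column by column.
expected = df.apply(lambda col: pd.Series(col.nsmallest(N).index))

print(smallest_labels)
```

With no tied values in a column, both approaches should give the same labels. With ties the order may differ, since `argsort` and `nsmallest` break ties differently.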
    

    Timings:

    In [111]: %timeit (pd.DataFrame(df.index[np.argsort(-df.values, axis=0)[-1:-1-N:-1]], columns=df.columns))
    159 µs ± 1.37 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
    
    In [112]: %timeit (df.apply(lambda x: pd.Series(x.nsmallest(N).index)))
    3.52 ms ± 49.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
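
To repeat this comparison outside IPython, something like the following sketch with the standard `timeit` module should work. The data is again hypothetical (the `Z` values and the index labels are assumptions), and the function names `with_argsort` and `with_nsmallest` are made up for illustration. Absolute numbers will depend on the machine and the pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical data: the Z values and the A..J index labels are assumptions.
df = pd.DataFrame(
    {
        "X": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
        "Y": [10, 20, 30, 40, 50, -10, -20, -30, -40, -50],
        "Z": [20, 18, 16, 14, 12, 10, 8, 6, 4, 2],
    },
    index=list("ABCDEFGHIJ"),
)
N = 3


def with_argsort():
    # One argsort over the whole 2-D array, then a label lookup.
    positions = np.argsort(-df.to_numpy(), axis=0)[-1:-1 - N:-1]
    return pd.DataFrame(df.index.to_numpy()[positions], columns=df.columns)


def with_nsmallest():
    # One nsmallest call per column through DataFrame.apply.
    return df.apply(lambda col: pd.Series(col.nsmallest(N).index))


runs = 50
t_argsort = timeit.timeit(with_argsort, number=runs) / runs
t_nsmallest = timeit.timeit(with_nsmallest, number=runs) / runs

print(f"argsort:   {t_argsort * 1e6:.0f} µs per call")
print(f"nsmallest: {t_nsmallest * 1e6:.0f} µs per call")
```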
    
