Find the column name which has the maximum value for each row

隐瞒了意图╮ 2020-11-22 10:30

I have a DataFrame like this one:

In [7]:
frame.head()
Out[7]:
Communications and Search   Business    General Lifestyle
0   0.745763    0.050847    0.118644         


        
3 Answers
  • 2020-11-22 11:07

    You can use idxmax with axis=1 to find the column with the greatest value on each row:

    >>> df.idxmax(axis=1)
    0    Communications
    1          Business
    2    Communications
    3    Communications
    4          Business
    dtype: object
    

    To create the new column 'Max', use df['Max'] = df.idxmax(axis=1).

    To find the row index at which the maximum value occurs in each column, use df.idxmax() (or equivalently df.idxmax(axis=0)).
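A minimal, self-contained sketch of the calls described above. The column names echo the question, but the values are made up for illustration and are not the asker's data.

```python
import pandas as pd

# Hypothetical data: column names follow the question, values are invented.
df = pd.DataFrame(
    {
        "Communications": [0.75, 0.10, 0.60],
        "Business": [0.05, 0.70, 0.30],
        "General": [0.20, 0.15, 0.10],
    }
)

# Column label of the largest value in each row.
row_max = df.idxmax(axis=1)

# Row index at which each column reaches its largest value.
col_max = df.idxmax(axis=0)

# Store the per-row result as a new column.
df["Max"] = row_max
print(df)
```

Note that `row_max` is computed before `'Max'` is added, so the new string column never takes part in the comparison.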

  • 2020-11-22 11:27

    To get the name of the column holding the maximum value while considering only a subset of the columns, use a variation of @ajcr's answer:

    df['Max'] = df[['Communications','Business']].idxmax(axis=1)
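As a rough illustration of how the subset changes the result, the sketch below uses invented values and a hypothetical `subset` list. It also shows one possible way to recompute over all numeric columns once the string column `'Max'` exists, using `select_dtypes`.

```python
import pandas as pd

# Hypothetical data; in the last row "General" holds the overall maximum.
df = pd.DataFrame(
    {
        "Communications": [0.75, 0.10, 0.30],
        "Business": [0.05, 0.70, 0.20],
        "General": [0.20, 0.20, 0.50],
    }
)

# Only these columns compete for the maximum (hypothetical choice).
subset = ["Communications", "Business"]
df["Max"] = df[subset].idxmax(axis=1)

# Restricting to numeric columns keeps the string column "Max" out of the comparison.
overall = df.select_dtypes(include="number").idxmax(axis=1)
print(df["Max"].tolist(), overall.tolist())
```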
    
  • 2020-11-22 11:28

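    You could also apply a function to each row of the DataFrame with axis=1 and take that row's argmax(). This relies on older pandas behavior, where Series.argmax() returned the label; in pandas 1.0 and later it returns the integer position, so x.idxmax() is needed there to get the column names shown: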

    In [144]: df.apply(lambda x: x.argmax(), axis=1)
    Out[144]:
    0    Communications
    1          Business
    2    Communications
    3    Communications
    4          Business
    dtype: object
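The sketch below contrasts the two calls inside `apply`, assuming pandas 1.0 or later, where `Series.argmax()` returns an integer position and `Series.idxmax()` returns the label. The data is invented for illustration.

```python
import pandas as pd

# Hypothetical data with the question's column names.
df = pd.DataFrame(
    {
        "Communications": [0.75, 0.10, 0.60],
        "Business": [0.05, 0.70, 0.30],
        "General": [0.20, 0.15, 0.10],
    }
)

# Integer position of the row maximum (pandas >= 1.0 behavior).
positions = df.apply(lambda row: row.argmax(), axis=1)

# Column label of the row maximum.
labels = df.apply(lambda row: row.idxmax(), axis=1)

# Positions can be mapped back to labels through df.columns.
mapped = list(df.columns[positions.to_numpy()])
print(positions.tolist(), labels.tolist(), mapped)
```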
    

    Here is a benchmark showing how much slower the apply approach is than idxmax() for len(df) ~ 20K (roughly 10x in this run):

    In [146]: %timeit df.apply(lambda x: x.argmax(), axis=1)
    1 loops, best of 3: 479 ms per loop
    
    In [147]: %timeit df.idxmax(axis=1)
    10 loops, best of 3: 47.3 ms per loop
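For readers without IPython, a comparable measurement can be sketched with the standard `timeit` module. The frame below is random data of roughly the same length, not the frame timed above, and the lambda uses `idxmax()` so that it returns labels on current pandas. Absolute timings will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical stand-in frame: 20,000 rows of random values.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.random((20_000, 4)),
    columns=["Communications", "Business", "General", "Lifestyle"],
)

# Both approaches should pick the same column for every row.
via_apply = df.apply(lambda row: row.idxmax(), axis=1)
via_idxmax = df.idxmax(axis=1)
same = bool((via_apply == via_idxmax).all())

# Time one run of the row-wise apply and the average of ten vectorized runs.
t_apply = timeit.timeit(lambda: df.apply(lambda row: row.idxmax(), axis=1), number=1)
t_idxmax = timeit.timeit(lambda: df.idxmax(axis=1), number=10) / 10
print(f"apply: {t_apply:.4f} s, idxmax: {t_idxmax:.4f} s, same result: {same}")
```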
    