Get column names for max values over a certain row in a pandas DataFrame

予麋鹿 2021-01-14 09:39

In the DataFrame

import pandas as pd
df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [3, 2, 1], 'col3': [1, 1, 1]}, index=['row1', 'row2', 'row3'])
print(df)
             


        
2 answers
  •  执念已碎
    2021-01-14 10:11

    If the maximum is not duplicated within a row, you can use idxmax, but it returns only the first column that holds the max value:

    print(df.idxmax(axis=1))
    row1    col2
    row2    col1
    row3    col1
    dtype: object
    
    def get_column_name_for_max_values_of(row):
        # .ix was removed in pandas 1.0; .loc does the label-based lookup
        return df.idxmax(axis=1).loc[row]
    
    print (get_column_name_for_max_values_of('row2'))
    col1
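
To make the tie behaviour concrete, here is a minimal sketch on the question's DataFrame. It assumes a pandas version where `.loc` is the label indexer, and the helper name `first_max_column` is made up for illustration. For `row2`, both `col1` and `col2` hold the maximum, yet only the first label comes back.

```python
import pandas as pd

df = pd.DataFrame(
    {'col1': [1, 2, 3], 'col2': [3, 2, 1], 'col3': [1, 1, 1]},
    index=['row1', 'row2', 'row3'],
)

def first_max_column(frame, row):
    # Hypothetical helper: idxmax on one row (a Series) returns the label
    # of the first occurrence of the maximum, ignoring any later ties.
    return frame.loc[row].idxmax()

first = first_max_column(df, 'row2')  # col2 ties with col1 but is not reported
print(first)
```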
    

    If the maximum occurs in more than one column, use boolean indexing instead:

    print(df.loc['row2'] == df.loc['row2'].max())
    col1     True
    col2     True
    col3    False
    Name: row2, dtype: bool
    
    print(df.loc[:, df.loc['row2'] == df.loc['row2'].max()])
          col1  col2
    row1     1     3
    row2     2     2
    row3     3     1
    
    print(df.loc[:, df.loc['row2'] == df.loc['row2'].max()].columns)
    Index(['col1', 'col2'], dtype='object')
    

    And the function is:

    def get_column_name_for_max_values_of(row):
        # Keep only the columns whose value in `row` equals that row's maximum
        return df.loc[:, df.loc[row] == df.loc[row].max()].columns.tolist()
    
    print (get_column_name_for_max_values_of('row2'))
    ['col1', 'col2']
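
As a possible variation, the same boolean-mask idea can be written so that the DataFrame is passed in rather than read from a global, and extended to every row at once. This is only a sketch: the names `max_columns` and `max_columns_per_row` are hypothetical and not part of pandas.

```python
import pandas as pd

df = pd.DataFrame(
    {'col1': [1, 2, 3], 'col2': [3, 2, 1], 'col3': [1, 1, 1]},
    index=['row1', 'row2', 'row3'],
)

def max_columns(frame, row):
    # Select the row once, then keep the labels whose value equals its max.
    values = frame.loc[row]
    return values.index[values == values.max()].tolist()

def max_columns_per_row(frame):
    # Compare every cell with its own row's maximum (aligned on the index),
    # then collect the matching column labels row by row.
    is_max = frame.eq(frame.max(axis=1), axis=0)
    return {
        label: frame.columns[mask.to_numpy()].tolist()
        for label, mask in is_max.iterrows()
    }

print(max_columns(df, 'row2'))
print(max_columns_per_row(df))
```

Selecting the row first avoids building an intermediate sub-DataFrame just to read its column labels.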
    
