Getting TypeError: reduction operation 'argmax' not allowed for this dtype when trying to use idxmax()


When using the idxmax() function in Pandas, I keep receiving this error.

Traceback (most recent call last):
  File "/Users/username/College/yea


                      
4 Answers

        
                         				            
            
           
            
                              
                
              
              
                
暗喜  2021-01-07 23:13
              
            
            
                                                                       
The error means the values in the column are stored with a non-numeric dtype (typically object), even if they look like numbers. idxmax(), idxmin(), argmax(), argmin() and similar functions need the dtype to be numeric.

The easiest solution is to use pd.to_numeric() to convert your Series (or columns) to a numeric type. For example, for a data frame df with a column 'a':

df['a'] = pd.to_numeric(df['a'])
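To make the conversion concrete, here is a minimal sketch. The frame and its values are made up for illustration (they are not the OP's data), and the column is forced to object dtype to mimic the situation described above.

```python
import pandas as pd

# Hypothetical data: numbers stored as strings in an object-dtype column.
df = pd.DataFrame({'a': pd.Series(['0.1', '0.7', '0.3'], dtype=object)})
dtype_before = df['a'].dtype

# Convert the column to a numeric dtype.
df['a'] = pd.to_numeric(df['a'])
dtype_after = df['a'].dtype

# idxmax() now returns the index label of the largest value.
best_idx = df['a'].idxmax()
print(dtype_before, dtype_after, best_idx)
```

Checking df.dtypes before and after the conversion is a quick way to confirm that the column really was the problem.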

A more complete answer on type casting on pandas can be found here.

Hope that helps :)
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
太阳男子  2021-01-07 23:25
              
            
            
                                                                       
In short, cast the column to float before calling idxmax(). Try this

best_c = results_table.loc[results_table['Mean recall score'].astype(float).idxmax()]['C_parameter']


instead of

best_c = results_table.loc[results_table['Mean recall score'].idxmax()]['C_parameter']
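A self-contained sketch of the same cast follows. The results_table built here is a hypothetical stand-in with invented values, since the OP's real table is not shown; only the column names come from the question.

```python
import pandas as pd

# Hypothetical stand-in for the OP's results_table. The scores are stored
# as strings in an object-dtype column, which is what breaks idxmax().
results_table = pd.DataFrame({
    'C_parameter': [0.01, 0.1, 1.0, 10.0],
    'Mean recall score': pd.Series(['0.82', '0.91', '0.88', '0.90'], dtype=object),
})

# Cast to float first, then find the index label of the highest score.
scores = results_table['Mean recall score'].astype(float)
best_row = scores.idxmax()

# Look up the C_parameter in that row.
best_c = results_table.loc[best_row, 'C_parameter']
print(best_row, best_c)
```

Note that astype(float) raises a ValueError if any value cannot be parsed as a number; pd.to_numeric(..., errors='coerce') is the more forgiving option in that case.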

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
难免孤独  2021-01-07 23:30
              
            
            
                                                                       
#best_c = results_table.loc[results_table['Mean recall score'].idxmax()]['C_parameter']


This is the line of code that needs fixing.

The main problem:

1) The dtype of the 'Mean recall score' column is object, so idxmax() cannot be used on it.
2) You need to convert 'Mean recall score' from object to float.
3) You can do that with apply(pd.to_numeric, errors='coerce', axis=0).

import pandas as pd

# Find the columns that are stored with object dtype
obj_cols = results_table.columns[results_table.dtypes.eq(object)]
# Convert them to numeric; values that cannot be parsed become NaN
results_table[obj_cols] = results_table[obj_cols].apply(pd.to_numeric, errors='coerce', axis=0)
# Look up the C_parameter in the row with the highest mean recall score
best_c = results_table.loc[results_table['Mean recall score'].idxmax()]['C_parameter']
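The effect of errors='coerce' is easier to see on a small example. This is only a sketch: the table below is invented, and the 'n/a' entry is an assumed example of a value that cannot be parsed as a number.

```python
import pandas as pd

# Hypothetical table: one score was recorded as text that is not a number.
results_table = pd.DataFrame({
    'C_parameter': pd.Series([0.01, 0.1, 1.0], dtype=object),
    'Mean recall score': pd.Series([0.82, 'n/a', 0.91], dtype=object),
})

# Convert every object-dtype column; unparseable values become NaN.
obj_cols = results_table.columns[results_table.dtypes.eq(object)]
results_table[obj_cols] = results_table[obj_cols].apply(pd.to_numeric, errors='coerce')

# Count how many scores were lost to coercion.
n_missing = int(results_table['Mean recall score'].isna().sum())

# idxmax() skips NaN by default, so the lookup still works.
best_c = results_table.loc[results_table['Mean recall score'].idxmax(), 'C_parameter']
print(n_missing, best_c)
```

Because coercion silently replaces bad values with NaN, it is worth checking the missing count afterwards, and converting only the columns that are supposed to be numeric if the table also has genuine text columns.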

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
既然无缘  2021-01-07 23:33
              
            
            
                                                                       
If NaN values are present (and the stack trace hints that they are), then a data frame you believe to be all numeric may actually contain mixed types, in particular a string among the numbers. Here are 3 code examples: the first 2 work, the last doesn't and is likely your case.

This is all numeric data, so idxmax works:

import pandas as pd

the_dict = {}
the_dict['a'] = [0.1, 0.2, 0.5]
the_dict['b'] = [0.3, 0.4, 0.6]
the_dict['c'] = [0.25, 0.3, 0.9]
the_dict['d'] = [0.2, 0.1, 0.4]
the_df = pd.DataFrame(the_dict)


This contains a numeric NaN, and idxmax still works:

import numpy as np
import pandas as pd

the_dict = {}
the_dict['a'] = [0.1, 0.2, 0.5]
the_dict['b'] = [0.3, 0.4, 0.6]
the_dict['c'] = [0.25, 0.3, 0.9]
the_dict['d'] = [0.2, 0.1, np.nan]  # np.nan is a float, so the column stays numeric
the_df = pd.DataFrame(the_dict)


This may not be exactly the OP's problem, but whenever a column has mixed types in any fashion, we get the error the OP reported. Here the last value of 'd' is the string 'NaN', not a numeric NaN:

import pandas as pd

the_dict = {}
the_dict['a'] = [0.1, 0.2, 0.5]
the_dict['b'] = [0.3, 0.4, 0.6]
the_dict['c'] = [0.25, 0.3, 0.9]
the_dict['d'] = [0.2, 0.1, 'NaN']  # a string, so column 'd' becomes object dtype
the_df = pd.DataFrame(the_dict)
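One way to confirm that a column has mixed types, and to repair it, is sketched below. It reuses two of the columns from the last example; the variable names other than the_df are my own, and using pd.to_numeric for the repair is one option rather than the only one.

```python
import pandas as pd

the_df = pd.DataFrame({
    'a': [0.1, 0.2, 0.5],
    'd': [0.2, 0.1, 'NaN'],  # the string 'NaN', not a numeric NaN
})

# The mixed column is not stored as a float dtype.
d_is_float_before = pd.api.types.is_float_dtype(the_df['d'])

# List the Python types actually present in the column.
types_in_d = {type(v).__name__ for v in the_df['d']}

# Convert to numeric; anything unparseable would become NaN.
the_df['d'] = pd.to_numeric(the_df['d'], errors='coerce')
d_is_float_after = pd.api.types.is_float_dtype(the_df['d'])

# idxmax() works again and skips the NaN.
best_idx = the_df['d'].idxmax()
print(types_in_d, d_is_float_before, d_is_float_after, best_idx)
```

Seeing both float and str in the set of types is the sign that a stray string has pushed the column to object dtype.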

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         