Getting TypeError: reduction operation 'argmax' not allowed for this dtype when trying to use idxmax()

[愿得一人] 2021-01-07 22:34

When using the idxmax() function in Pandas, I keep receiving this error.

Traceback (most recent call last):
  File "/Users/username/College/yea


        
4 Answers
  • 2021-01-07 23:13

    The cell values in your column are stored with a non-numeric dtype (most likely object). argmin(), idxmin(), argmax(), and similar functions need a numeric dtype.

    The easiest solution is to use pd.to_numeric() in order to convert your series (or columns) to numeric types. An example with a data frame df with a column 'a' would be:

    df['a'] = pd.to_numeric(df['a'])
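
As a rough illustration of the conversion above, here is a minimal sketch. The data frame and its values are made up for the example; the column is built with dtype=object on purpose so that it mimics the situation in the question.

```python
import pandas as pd

# Hypothetical data: numbers stored as strings in an object-dtype column.
df = pd.DataFrame({'a': pd.Series(['0.1', '0.7', '0.3'], dtype=object)})
dtype_before = df['a'].dtype      # object

# Parse the strings into floats.
df['a'] = pd.to_numeric(df['a'])
dtype_after = df['a'].dtype       # float64

# With a numeric dtype, idxmax() returns the index label of the largest value.
best_row = df['a'].idxmax()
print(dtype_before, dtype_after, best_row)
```

Checking df.dtypes before calling idxmax() is a quick way to see whether a column needs this conversion.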

    A more complete answer on type casting on pandas can be found here.

    Hope that helps :)

  • 2021-01-07 23:25

    In short, try this

    best_c = results_table.loc[results_table['Mean recall score'].astype(float).idxmax()]['C_parameter']
    

    instead of

    best_c = results_table.loc[results_table['Mean recall score'].idxmax()]['C_parameter']
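
A small sketch of why the astype(float) version works. The results_table below is hypothetical (the real one from the question is not shown); its score column holds Python floats in an object-dtype column, which is one plausible way to end up with this error.

```python
import pandas as pd

# Hypothetical stand-in for the table in the question.
results_table = pd.DataFrame({
    'C_parameter': [0.01, 0.1, 1.0, 10.0, 100.0],
    'Mean recall score': pd.Series([0.55, 0.62, 0.91, 0.88, 0.87], dtype=object),
})

# Cast the object column to float64 before looking for the maximum.
scores = results_table['Mean recall score'].astype(float)
best_idx = scores.idxmax()                            # label of the best row
best_c = results_table.loc[best_idx, 'C_parameter']   # value of C in that row
print(best_idx, best_c)
```

Note that astype(float) raises a ValueError if the column contains a string that cannot be parsed as a number; pd.to_numeric with errors='coerce' turns such values into NaN instead.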
    
  • 2021-01-07 23:30
    #best_c = results_table.loc[results_table['Mean recall score'].idxmax()]['C_parameter']
    

    The line above needs to be replaced.

    The main problem:

    1) The dtype of 'Mean recall score' is object, so idxmax() cannot be applied to it.
    2) The column should be converted from object to float.
    3) apply(pd.to_numeric, errors='coerce', axis=0) can do this conversion.

    best_c = results_table  # note: this is an alias for results_table, not a copy
    best_c.dtypes.eq(object)  # True for every column whose dtype is object
    new = best_c.columns[best_c.dtypes.eq(object)]  # names of the object columns
    best_c[new] = best_c[new].apply(pd.to_numeric, errors='coerce', axis=0)  # convert them to numeric; unparseable values become NaN
    best_c  # inspect the converted table
    best_c = results_table.loc[results_table['Mean recall score'].idxmax()]['C_parameter']  # C_parameter of the row with the highest mean recall score
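
To show what errors='coerce' does in this approach, here is a minimal, self-contained sketch. The table contents are invented for illustration, including the 'n/a' entry that cannot be parsed as a number.

```python
import pandas as pd

# Hypothetical table: both columns are object dtype, and one score is not a number.
results_table = pd.DataFrame({
    'C_parameter': pd.Series([0.01, 0.1, 1.0], dtype=object),
    'Mean recall score': pd.Series([0.55, 'n/a', 0.91], dtype=object),
})

# Select the object columns and convert them; unparseable entries become NaN.
object_cols = results_table.columns[results_table.dtypes.eq(object)]
results_table[object_cols] = results_table[object_cols].apply(
    pd.to_numeric, errors='coerce'
)

n_missing = int(results_table['Mean recall score'].isna().sum())

# idxmax() skips NaN by default, so the coerced entry does not get in the way.
best_c = results_table.loc[results_table['Mean recall score'].idxmax(), 'C_parameter']
print(results_table.dtypes, n_missing, best_c)
```

Coercing silently hides bad values, so it may be worth checking how many NaN entries appear after the conversion.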
    
  • 2021-01-07 23:33

    If NaN values are present (and the stack trace hints at this), a data frame you believe to be all numeric may actually contain mixed types, in particular a string among the numbers. Here are three code examples: the first two work, while the last does not and is likely your case.

    This is all numeric data; it works with idxmax:

    the_dict = {}
    the_dict['a'] = [0.1, 0.2, 0.5]
    the_dict['b'] = [0.3, 0.4, 0.6]
    the_dict['c'] = [0.25, 0.3, 0.9]
    the_dict['d'] = [0.2, 0.1, 0.4]
    the_df = pd.DataFrame(the_dict)
    

    This contains a numeric NaN; it also works with idxmax:

    the_dict = {}
    the_dict['a'] = [0.1, 0.2, 0.5]
    the_dict['b'] = [0.3, 0.4, 0.6]
    the_dict['c'] = [0.25, 0.3, 0.9]
    the_dict['d'] = [0.2, 0.1, np.nan]  # np.nan; the np.NaN alias was removed in NumPy 2.0
    the_df = pd.DataFrame(the_dict)
    

    This may not be exactly the problem the OP hit, but whenever a column contains mixed types, we get the error the OP reported:

    the_dict = {}
    the_dict['a'] = [0.1, 0.2, 0.5]
    the_dict['b'] = [0.3, 0.4, 0.6]
    the_dict['c'] = [0.25, 0.3, 0.9]
    the_dict['d'] = [0.2, 0.1, 'NaN']
    the_df = pd.DataFrame(the_dict)
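
The difference between the last two cases can be seen by inspecting the dtypes. This is a sketch using only column 'd' from the examples above; it avoids calling idxmax() on the mixed column directly, since the exact error raised there may differ between pandas versions.

```python
import numpy as np
import pandas as pd

numeric_nan = pd.DataFrame({'d': [0.2, 0.1, np.nan]})   # real NaN
string_nan = pd.DataFrame({'d': [0.2, 0.1, 'NaN']})     # the string 'NaN'

print(numeric_nan['d'].dtype)   # float64: idxmax() works and skips the NaN
print(string_nan['d'].dtype)    # object: floats mixed with a string

ok_idx = numeric_nan['d'].idxmax()

# One possible repair: coerce the mixed column to numeric first.
fixed = pd.to_numeric(string_nan['d'], errors='coerce')
fixed_idx = fixed.idxmax()
print(ok_idx, fixed_idx)
```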
    