LabelEncoder: TypeError: '>' not supported between instances of 'float' and 'str'

一生所求 2021-01-30 00:25

I'm getting this error for multiple variables, even after treating missing values. For example:

le = preprocessing.LabelEncoder()
categorical = list(df.select_dtypes(in         


        
3 Answers
  • 2021-01-30 00:32

    Because strings have variable length, pandas stores them with dtype object by default. I ran into this problem after treating missing values too. Converting all of the affected columns to type 'category' before label encoding worked in my case.

    df[cat]=df[cat].astype('category')
    

    Then check df.dtypes to confirm the conversion and perform the label encoding.

  • 2021-01-30 00:49

    Alternatively, cast the values to str and split them, so that every element has the same type (str):

    unique, counts = numpy.unique(str(a).split(), return_counts=True)
    
  • 2021-01-30 00:50

    This happens because the series df[cat] contains elements of different data types (for example, both strings and floats). That can come from the way the data is read, with numbers parsed as float and text as str, or from a column whose datatype was float and changed after the fillna operation.

    In other words,

    the pandas dtype 'object' does not guarantee str elements; an object column can hold mixed types,

    so using the following line:

    df[cat] = le.fit_transform(df[cat].astype(str))
    


    should help
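
The diagnosis and fix in the last answer can be sketched as follows. This is a minimal illustration, not the asker's actual data: the DataFrame, the column name `city`, and its values are hypothetical. It counts the Python types present in the column, shows that encoding the mixed column raises a TypeError, and then encodes the same column after casting it to str.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical data: one column mixes strings with a float, as can happen
# when a file is read or when missing values are filled with a number.
df = pd.DataFrame({"city": ["Paris", "Rome", 0.0, "Paris"]})

# Count the Python types in the column; more than one entry means the
# column holds mixed types.
type_counts = df["city"].map(type).value_counts()
print(type_counts)

le = LabelEncoder()
raised = False
try:
    # LabelEncoder sorts the unique values, and str cannot be compared
    # with float, so this call fails.
    le.fit_transform(df["city"])
except TypeError as exc:
    raised = True
    print("mixed types fail:", exc)

# After casting to str every element is comparable, so encoding succeeds.
codes = le.fit_transform(df["city"].astype(str))
print(list(le.classes_), codes.tolist())
```

Note that after the cast the float is encoded as the string '0.0', which is a class of its own. If such values are really placeholders for missing data, it is probably better to replace them with an explicit label before encoding.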

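
For the 'category' conversion suggested in the first answer, a small sketch may help. The column name `color` and its values are hypothetical. The answer goes on to use LabelEncoder after the conversion; the sketch below instead reads integer codes directly from pandas via `.cat.codes`, which is a different technique, shown here as one possible alternative.

```python
import numpy as np
import pandas as pd

# Hypothetical column with a missing value.
df = pd.DataFrame({"color": ["red", "blue", np.nan, "red"]})

# Convert to the categorical dtype, then confirm the change via dtypes.
df["color"] = df["color"].astype("category")
print(df.dtypes)

# Each category gets an integer code; missing values are coded as -1.
df["color_code"] = df["color"].cat.codes
print(df)
```

Because missing values come out as -1 rather than as a regular class, they may need separate handling before the codes are passed to a model.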