LabelEncoder: TypeError: '>' not supported between instances of 'float' and 'str'


I'm facing this error for multiple variables even after treating missing values. For example:

le = preprocessing.LabelEncoder()
categorical = list(df.select_dtypes(in


                      
3 Answers

        
                         				            
            
           
            
                              
                
              
              
                
旧巷少年郎, 2021-01-30 00:32
              
            
            
                                                                       
Because strings have variable length, pandas stores them with the object dtype by default. I ran into this problem too, even after treating missing values. Converting the affected columns to the 'category' dtype before label encoding worked in my case.

df[cat] = df[cat].astype('category')


Then check df.dtypes and perform the label encoding.
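
A minimal sketch of the conversion above, assuming a small made-up column with one missing value. The names `df` and `cat` mirror the answer; the column name `color` and the data are invented for illustration. It also prints the integer codes pandas assigns to each category, which may already be enough if a plain integer encoding is all that is needed.

```python
import pandas as pd

# Hypothetical data: a text column with one missing value.
df = pd.DataFrame({"color": ["red", float("nan"), "blue", "red"]})
cat = "color"

# Convert the column to the 'category' dtype, as suggested above.
df[cat] = df[cat].astype("category")
print(df.dtypes)

# Each category gets an integer code; missing values are coded as -1.
codes = df[cat].cat.codes.tolist()
print(codes)
```

Whether a missing value should stay missing or become its own class is a modelling choice, so it is worth inspecting the codes before feeding them to a model.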
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
野的像风, 2021-01-30 00:49
              
            
            
                                                                       
Alternatively, cast every element to str so that the values share a uniform type:

unique, counts = numpy.unique([str(x) for x in a], return_counts=True)
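
To illustrate the idea, here is a small sketch that contrasts a mixed-type array with its string-cast version. The array `a` is made-up data, and the element-wise `str(x)` cast is one possible way to get a uniform type, not necessarily what the answer's author ran.

```python
import numpy as np

# Hypothetical mixed-type data: strings and a float in one object array.
a = np.array(["b", 1.0, "a", "b"], dtype=object)

# np.unique sorts its input, so mixed str/float values typically raise TypeError.
try:
    np.unique(a)
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Casting every element to str gives a uniform type that can be sorted.
unique, counts = np.unique([str(x) for x in a], return_counts=True)
print(mixed_ok, unique.tolist(), counts.tolist())
```

Note that after the cast the float 1.0 is counted as the string '1.0', so numbers and text end up in the same set of labels.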

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
忘掉有多难, 2021-01-30 00:50
              
            
            
                                                                       
This happens because the series df[cat] contains elements of different data types, e.g. both strings and floats. That can come from the way the data is read (numbers are read as floats and text as strings), or the column was float and its type changed after the fillna operation.

In other words:


  the pandas dtype 'object' can indicate mixed types rather than a pure str type


So casting the column to str before encoding should help:


df[cat] = le.fit_transform(df[cat].astype(str))
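
One way to check whether a column really mixes types is to count the Python types of its values before and after the cast. The sketch below assumes a made-up column named `grade`; `df` and `cat` follow the naming used in the answer.

```python
import pandas as pd

# Hypothetical column mixing text and floats, stored with dtype 'object'.
df = pd.DataFrame({"grade": pd.Series(["a", 1.5, "b", 2.0], dtype=object)})
cat = "grade"

# Collect the names of the Python types actually present in the column.
types_before = set(df[cat].map(lambda v: type(v).__name__))
print(types_before)

# Cast every value to str so the column holds a single type.
as_text = df[cat].astype(str)
types_after = set(as_text.map(lambda v: type(v).__name__))
print(types_after)
```

The resulting `as_text` series could then be passed to `le.fit_transform`. Depending on the pandas version, a missing value cast this way may become the literal string 'nan' and be encoded as its own class, so it is worth deciding how to treat missing values first.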
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         