sklearn classifier raises ValueError: bad input shape


I have a CSV file with the columns CAT1, CAT2, TITLE, URL, CONTENT. CAT1, CAT2, TITLE, and CONTENT are in Chinese.

I want to train LinearSVC or Multinomial


                      
2 Answers

北荒, 2021-01-17 10:36

Thanks to @meelo, I solved this problem.
As he pointed out, in my code data is the feature matrix and target holds the target values. I had mixed the two up.

I learned that TfidfVectorizer turns the data into a matrix of shape (n_samples, n_features), and each sample must map to exactly one target.

If I want to predict two kinds of target, I need two distinct target vectors:

target_C1 with all C1 values
target_C2 with all C2 values

Then I use each target with the original data to train two classifiers, one per target.
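
The approach above can be sketched roughly as follows. This is a minimal illustration, not the original poster's code: the toy rows, the English sample text, and the variable names (rows, target_c1, target_c2, clf_c1, clf_c2) are assumptions made for the example. Chinese text would typically also need its own tokenizer passed to TfidfVectorizer, which is left out here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

# Hypothetical toy rows standing in for the CSV columns (CAT1, CAT2, CONTENT).
rows = [
    ("sports", "football", "the team scored a late goal in the match"),
    ("sports", "tennis", "she won the final set with a strong serve"),
    ("tech", "phones", "the new phone has a larger screen and battery"),
    ("tech", "laptops", "this laptop ships with a faster processor"),
]

# One 1-D target per category level, plus the shared text data.
target_c1 = [row[0] for row in rows]
target_c2 = [row[1] for row in rows]
texts = [row[2] for row in rows]

# The same feature matrix, shape (n_samples, n_features), feeds both models.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts)

# Train one classifier per target instead of passing a 2-column target.
clf_c1 = LinearSVC().fit(X, target_c1)
clf_c2 = LinearSVC().fit(X, target_c2)

# Predict both category levels for a new document.
new_X = vectorizer.transform(["the match ended after a late goal"])
pred_c1 = clf_c1.predict(new_X)
pred_c2 = clf_c2.predict(new_X)
```

Each call to fit receives a target of shape (n_samples,), which is what avoids the "bad input shape" error.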
                                                                        
                                                        
            
            
              
                
盖世英雄少女心, 2021-01-17 10:56

I had the same issue.

If you are facing the same problem, check the shapes of the arguments you pass to clf.fit(X, y):

X: training vectors, {array-like, sparse matrix} of shape (n_samples, n_features).

y: target vector relative to X, array-like of shape (n_samples,).

As you can see, y must be one-dimensional. To make sure your target vector is shaped correctly, inspect

y.shape


which should be (n_samples,).
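
As a small illustration of that check, the sketch below compares a correctly shaped target with a two-column one. The helper name shapes_ok and the zero-filled arrays are hypothetical and exist only for this example; the helper relies on nothing more than the .shape attribute.

```python
import numpy as np

def shapes_ok(X, y):
    """Return True when X is 2-D, y is 1-D, and both have the same n_samples."""
    return len(X.shape) == 2 and len(y.shape) == 1 and X.shape[0] == y.shape[0]

X = np.zeros((5, 3))      # 5 samples, 3 features
y_bad = np.zeros((5, 2))  # two target columns: not a valid 1-D target
y_good = y_bad[:, 0]      # keep a single column, shape (5,)

print(y_bad.shape, shapes_ok(X, y_bad))
print(y_good.shape, shapes_ok(X, y_good))
```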

In my case, I built the final training vector by concatenating 3 separate vectors from 3 different vectorizers.
The problem was that each of them carried its own ['Label'] column, so the final training vector contained 3 ['Label'] columns.
As a result, when I used final_trainingVect['Label'] as my target vector, its shape was (n_samples, 3).
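
The situation can be reproduced with a small sketch, assuming the three pieces are pandas DataFrames (the answer does not say which container was used). The frames part1, part2, part3 and the feature columns f1, f2, f3 are made up for illustration.

```python
import pandas as pd

labels = ["a", "b", "a", "b"]

# Three hypothetical feature frames, each carrying its own 'Label' column.
part1 = pd.DataFrame({"f1": [1, 2, 3, 4], "Label": labels})
part2 = pd.DataFrame({"f2": [5, 6, 7, 8], "Label": labels})
part3 = pd.DataFrame({"f3": [9, 10, 11, 12], "Label": labels})

# Column-wise concatenation keeps all three 'Label' columns.
final_training_vect = pd.concat([part1, part2, part3], axis=1)

# Selecting a duplicated column name returns a DataFrame, not a Series.
bad_y = final_training_vect["Label"]  # shape (n_samples, 3)

# One way out: take the label once and drop every 'Label' column from X.
y = part1["Label"]                              # shape (n_samples,)
X = final_training_vect.drop(columns="Label")   # only f1, f2, f3 remain

print(bad_y.shape, y.shape, X.shape)
```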
                                                                        
                                                        
            
            
              
                