sklearn classifier get ValueError: bad input shape

佛祖请我去吃肉 2021-01-17 09:59

I have a CSV file with the columns CAT1, CAT2, TITLE, URL, CONTENT. The CAT1, CAT2, TITLE, and CONTENT values are in Chinese.

I want to train LinearSVC or Multinomial

2 Answers
  • 2021-01-17 10:36

    Thanks to @meelo, I solved this problem. As he said, in my code data is the feature vector and target is the target value; I had mixed up the two.

    I learned that TfidfVectorizer turns the data into a matrix of shape [n_samples, n_features], and each sample should map to exactly one target.

    If I want to predict two types of target, I need two distinct target vectors:

    1. target_C1 with all the C1 values
    2. target_C2 with all the C2 values

    Then I use the original data with each of the two targets to train two classifiers, one per target.
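
The approach above could look roughly like the following sketch. The toy English texts, the label values, and the variable names (contents, cat1, cat2, clf_cat1, clf_cat2) are made up for illustration and are not from the original code; real Chinese text would most likely need word segmentation before vectorizing.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

# Hypothetical stand-ins for the CONTENT, CAT1 and CAT2 columns.
contents = [
    "stock market rises",
    "stock prices fall",
    "team wins match",
    "player scores goal",
]
cat1 = ["finance", "finance", "sports", "sports"]
cat2 = ["news", "analysis", "news", "analysis"]

# One shared feature matrix of shape (n_samples, n_features).
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(contents)

# One classifier per target; each target is 1-D with shape (n_samples,).
clf_cat1 = LinearSVC().fit(X, cat1)
clf_cat2 = LinearSVC().fit(X, cat2)

# New text must go through the same fitted vectorizer before predicting.
new_X = vectorizer.transform(["stock market falls"])
pred_cat1 = clf_cat1.predict(new_X)
pred_cat2 = clf_cat2.predict(new_X)
```

Both classifiers share the same X; only the target vector passed to fit differs.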

  • 2021-01-17 10:56

    I had the same issue.

    If you are facing the same problem, check the shapes of the parameters passed to clf.fit(X, y):

    X : Training vector {array-like, sparse matrix}, shape (n_samples, n_features).

    y : Target vector relative to X, array-like, shape (n_samples,).

    As you can see, y must be one-dimensional (a single target value per sample). To make sure your target vector is shaped correctly, run

    y.shape
    

    The result should be (n_samples,).
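
As a small illustration of that check, here is a sketch using NumPy arrays. The arrays y_bad and y_col are invented examples, not data from the question.

```python
import numpy as np

# A two-column target like (CAT1, CAT2) has shape (n_samples, 2),
# which is not the 1-D shape that fit expects for y here.
y_bad = np.array([["a", "x"], ["b", "y"], ["a", "x"]])
print(y_bad.shape)

# Selecting one column at a time gives 1-D targets of shape (n_samples,).
y_c1 = y_bad[:, 0]
y_c2 = y_bad[:, 1]
print(y_c1.shape)

# A single column stored as (n_samples, 1) can be flattened with ravel().
y_col = np.array([[0], [1], [0]])
y_flat = y_col.ravel()
print(y_flat.shape)
```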

    In my case, I built my training vector by concatenating 3 separate vectors from 3 different vectorizers. The problem was that each vector had the ['Label'] column in it, so the final training vector contained 3 ['Label'] columns. When I then used final_trainingVect['Label'] as my target vector, its shape was (n_samples, 3).
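
The situation described above can be reproduced with a sketch like this one, assuming the three vectors were pandas DataFrames. The frames part1, part2, part3 and their feature columns are hypothetical; only the 'Label' column name comes from the answer.

```python
import pandas as pd

labels = ["a", "b", "a"]

# Three hypothetical feature frames, each carrying its own 'Label' column.
part1 = pd.DataFrame({"f1": [0.1, 0.2, 0.3], "Label": labels})
part2 = pd.DataFrame({"f2": [1.0, 2.0, 3.0], "Label": labels})
part3 = pd.DataFrame({"f3": [5, 6, 7], "Label": labels})

# Concatenating column-wise keeps all three 'Label' columns.
combined = pd.concat([part1, part2, part3], axis=1)

# With duplicate column names, this selects a DataFrame, not a Series.
bad_y = combined["Label"]
print(bad_y.shape)

# One possible fix: drop 'Label' from each part and keep a single copy.
features = pd.concat(
    [p.drop(columns="Label") for p in (part1, part2, part3)], axis=1
)
y = part1["Label"]
print(features.shape, y.shape)
```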
