LogisticRegression: Unknown label type: 'continuous' using sklearn in python

轮回少年 2020-11-28 05:24

I have the following code to test some of the most popular ML algorithms in the sklearn Python library:

import numpy as np
from sklearn                        import         


        
3 Answers
  • 2020-11-28 05:44

    I struggled with the same issue when trying to feed floats to the classifiers. I wanted to keep the targets as floats rather than round them to integers, so as not to lose precision. Try using regressor algorithms instead. For example:

    import numpy as np
    from sklearn import linear_model
    from sklearn import svm
    
    # Regressors accept continuous (float) targets, unlike classifiers.
    regressors = [
        svm.SVR(),
        linear_model.SGDRegressor(),
        linear_model.BayesianRidge(),
        linear_model.LassoLars(),
        linear_model.ARDRegression(),
        linear_model.PassiveAggressiveRegressor(),
        linear_model.TheilSenRegressor(),
        linear_model.LinearRegression()]
    
    trainingData    = np.array([ [2.3, 4.3, 2.5],  [1.3, 5.2, 5.2],  [3.3, 2.9, 0.8],  [3.1, 4.3, 4.0]  ])
    trainingScores  = np.array( [3.4, 7.5, 4.5, 1.6] )
    predictionData  = np.array([ [2.5, 2.4, 2.7],  [2.7, 3.2, 1.2] ])
    
    # Fit each regressor on the same data and print its predictions.
    for reg in regressors:
        print(reg)
        reg.fit(trainingData, trainingScores)
        print(reg.predict(predictionData), '\n')
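
The DecisionTree and KNeighbors models mentioned in this thread also have regressor variants in scikit-learn. Below is a minimal sketch using the same toy data. Note that `n_neighbors=2` and `random_state=0` are my own choices for illustration, not something from the question: the default of 5 neighbors would exceed the 4 training samples.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

training_data = np.array([[2.3, 4.3, 2.5], [1.3, 5.2, 5.2], [3.3, 2.9, 0.8], [3.1, 4.3, 4.0]])
training_scores = np.array([3.4, 7.5, 4.5, 1.6])
prediction_data = np.array([[2.5, 2.4, 2.7], [2.7, 3.2, 1.2]])

# Assumed settings: n_neighbors must not exceed the number of training samples (4 here).
regressors = [
    DecisionTreeRegressor(random_state=0),
    KNeighborsRegressor(n_neighbors=2),
]

# Collect predictions per model; float targets are accepted without error.
results = {}
for reg in regressors:
    reg.fit(training_data, training_scores)
    results[type(reg).__name__] = reg.predict(prediction_data)
    print(type(reg).__name__, results[type(reg).__name__])
```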
    
  • 2020-11-28 05:48

    You are passing floats to a classifier, which expects categorical values as the target vector. If you convert the targets to int they will be accepted as input (although it is questionable whether that is the right way to do it).

    It would be better to convert your training scores using scikit-learn's LabelEncoder class.

    The same is true for your DecisionTree and KNeighbors classifiers.

    import numpy as np
    from sklearn import preprocessing
    from sklearn.utils.multiclass import type_of_target
    
    # The float targets used in this thread
    trainingScores = np.array([3.4, 7.5, 4.5, 1.6])
    
    # Map each distinct score to an integer class label
    lab_enc = preprocessing.LabelEncoder()
    encoded = lab_enc.fit_transform(trainingScores)
    # encoded -> array([1, 3, 2, 0])
    
    print(type_of_target(trainingScores))
    # continuous
    
    print(type_of_target(trainingScores.astype('int')))
    # multiclass
    
    print(type_of_target(encoded))
    # multiclass
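
To round this out, here is a rough sketch of how the encoded labels might be fed to a classifier and then mapped back to the original scores with `inverse_transform`. Keep in mind this treats every distinct float as its own class, so the model can only ever predict scores it has already seen; that is rarely what you want for genuinely continuous data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import LabelEncoder

training_data = np.array([[2.3, 4.3, 2.5], [1.3, 5.2, 5.2], [3.3, 2.9, 0.8], [3.1, 4.3, 4.0]])
training_scores = np.array([3.4, 7.5, 4.5, 1.6])
prediction_data = np.array([[2.5, 2.4, 2.7], [2.7, 3.2, 1.2]])

# Each distinct score becomes one integer class (sorted order: 1.6 -> 0, 3.4 -> 1, ...).
lab_enc = LabelEncoder()
encoded = lab_enc.fit_transform(training_scores)

# The classifier now sees a discrete (multiclass) target and fits without the error.
clf = LogisticRegression().fit(training_data, encoded)
predicted_classes = clf.predict(prediction_data)

# Map predicted class indices back to the original float scores.
decoded = lab_enc.inverse_transform(predicted_classes)
print(decoded)
```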
    
  • 2020-11-28 05:53

    LogisticRegression is not for regression but for classification!

    The Y variable must be a class label (for example, 0 or 1), not a continuous variable; a continuous target would make it a regression problem.
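
A small sketch to make the distinction concrete: fitting on the float scores raises the error from the title, while fitting on 0/1 labels works. The cutoff of 4.0 used to derive the labels is purely hypothetical; pick whatever makes sense for your data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[2.3, 4.3, 2.5], [1.3, 5.2, 5.2], [3.3, 2.9, 0.8], [3.1, 4.3, 4.0]])
y_continuous = np.array([3.4, 7.5, 4.5, 1.6])

# Continuous targets are rejected by the classifier with a ValueError.
error_message = None
try:
    LogisticRegression().fit(X, y_continuous)
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Hypothetical cutoff (an assumption for illustration): scores above 4.0 become class 1.
THRESHOLD = 4.0
y_binary = (y_continuous > THRESHOLD).astype(int)

# With discrete class labels the same model fits and predicts 0 or 1.
clf = LogisticRegression().fit(X, y_binary)
predictions = clf.predict(np.array([[2.5, 2.4, 2.7], [2.7, 3.2, 1.2]]))
print(predictions)
```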
