scikit-learn: how to get decision probabilities for a LinearSVC classifier

一整个雨季 2020-12-17 01:52

I am using scikit-learn's LinearSVC classifier for text mining. The y values are 0/1 labels, and X is the TfidfVectorizer output for the text documents.

I us

2 answers
  • 2020-12-17 02:17

    You can't: LinearSVC does not provide a predict_proba method. However, you can use sklearn.svm.SVC with kernel='linear' and probability=True.

    It may take longer to run, but this classifier exposes probabilities through its predict_proba method.

    from sklearn.svm import SVC

    # probability=True is required for predict_proba to be available
    clf = SVC(kernel='linear', probability=True)
    clf.fit(X, y)
    clf.predict_proba(X_test)
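
    To make the "you can't" concrete, here is a minimal sketch showing that LinearSVC has no predict_proba but does expose decision_function, which returns uncalibrated signed distances to the hyperplane rather than probabilities. The synthetic data from make_classification is an assumption for illustration only; it is not the asker's data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

# Synthetic binary data, purely for illustration (not from the question).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

clf = LinearSVC().fit(X, y)

# LinearSVC does not implement predict_proba.
has_proba = hasattr(clf, "predict_proba")
print(has_proba)

# decision_function returns one signed score per sample in the binary case.
# The scores are not probabilities and are not bounded to [0, 1].
scores = clf.decision_function(X[:5])
print(scores)

# In the binary case, the predicted label is determined by the sign of the score.
preds_from_scores = (scores > 0).astype(int)
```

    If a ranking or a threshold is all that is needed, these scores may be enough; for actual probabilities, a calibration step or a different classifier is required.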
    
  • 2020-12-17 02:32

    If you insist on using the LinearSVC class, you can wrap it in a sklearn.calibration.CalibratedClassifierCV object and fit that wrapper instead. The fitted calibrated classifier provides predict_proba.

    from sklearn.svm import LinearSVC
    from sklearn.calibration import CalibratedClassifierCV
    from sklearn import datasets
    
    #Load iris dataset
    iris = datasets.load_iris()
    X = iris.data[:, :2] # Using only two features
    y = iris.target      #3 classes: 0, 1, 2
    
    linear_svc = LinearSVC()     # The base estimator
    
    # Wrap the base estimator so that it can output calibrated probabilities
    calibrated_svc = CalibratedClassifierCV(linear_svc,
                                            method='sigmoid',  # Platt scaling; see the documentation for other methods
                                            cv=3)
    calibrated_svc.fit(X, y)
    
    
    # predict
    prediction_data = [[2.3, 5],
                       [4, 7]]
    predicted_probs = calibrated_svc.predict_proba(prediction_data)  # use predict_proba, not predict
    print(predicted_probs)
    

    Here is the output:

    [[  9.98626760e-01   1.27594869e-03   9.72912751e-05]
     [  9.99578199e-01   1.79053170e-05   4.03895759e-04]]
    

    Each row gives the probabilities of the three classes (0, 1, 2) for one data point.
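
    The same wrapper can be applied to the setup described in the question (TfidfVectorizer features with 0/1 labels). The following is only a sketch under stated assumptions: the toy corpus, the labels, and the variable names are invented for illustration, and make_pipeline is used here as one convenient way to chain the vectorizer and the calibrated classifier.

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy corpus and 0/1 labels, invented for illustration only.
docs = [
    "great movie with a wonderful cast",
    "wonderful story and great acting",
    "loved the film, great soundtrack",
    "a wonderful and moving film",
    "great fun, loved every minute",
    "loved the wonderful ending",
    "terrible plot and awful acting",
    "awful movie, boring story",
    "boring film with a terrible script",
    "terrible pacing, awful ending",
    "boring and awful from start to finish",
    "a terrible, boring mess",
]
labels = [1] * 6 + [0] * 6

# TF-IDF features feed a LinearSVC wrapped in a sigmoid (Platt) calibrator.
# cv=3 needs at least 3 samples per class, which this toy corpus satisfies.
model = make_pipeline(
    TfidfVectorizer(),
    CalibratedClassifierCV(LinearSVC(), method="sigmoid", cv=3),
)
model.fit(docs, labels)

# One row per document; columns follow model.classes_, i.e. [P(0), P(1)].
probs = model.predict_proba(["great wonderful film", "awful boring plot"])
print(probs)
```

    With so few documents the calibrated values are not meaningful; on a real corpus, the second column can be read as the estimated probability of label 1.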
