What is the difference between class_weight=None and class_weight='auto' in scikit-learn's SVM?

执笔经年 2021-01-06 02:11

In the scikit-learn SVM classifier, what is the difference between class_weight=None and class_weight='auto'?

From the documentation it is given as

2 answers
  • 2021-01-06 02:13

    This is quite an old post, but for anyone who has just run into this problem: class_weight='auto' has been deprecated as of version 0.17. Use class_weight='balanced' instead.

    http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html

    The 'balanced' mode computes the class weights as follows:

    n_samples / (n_classes * np.bincount(y))
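
The formula above can be tried out directly with NumPy. This is a minimal sketch, not scikit-learn's own code: the helper name balanced_class_weights is hypothetical, and it assumes the labels are non-negative integers 0..n_classes-1 with every class present at least once.

```python
import numpy as np


def balanced_class_weights(y):
    """Hypothetical helper: n_samples / (n_classes * np.bincount(y)).

    Assumes y holds integer labels 0..n_classes-1, each present at least once.
    """
    y = np.asarray(y)
    counts = np.bincount(y)        # number of samples per class
    n_samples = y.shape[0]
    n_classes = counts.shape[0]
    return n_samples / (n_classes * counts)


# Imbalanced toy labels: eight samples of class 0, two of class 1.
y = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
weights = balanced_class_weights(y)
print(weights)  # class 0 gets 0.625, class 1 gets 2.5
```

The rarer class ends up with the larger weight, which is the point of the 'balanced' setting.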

    Cheers!

  • 2021-01-06 02:15

    This takes place in the class_weight.py file:

    elif class_weight == 'auto':
        # Find the weight of each class as present in y.
        le = LabelEncoder()
        y_ind = le.fit_transform(y)
        if not all(np.in1d(classes, le.classes_)):
            raise ValueError("classes should have valid labels that are in y")
    
        # inversely proportional to the number of samples in the class
        recip_freq = 1. / bincount(y_ind)
        weight = recip_freq[le.transform(classes)] / np.mean(recip_freq)
    

    This means that each class in classes gets a weight equal to 1 divided by the number of times that class appears in your data (y), so classes that appear more often get lower weights. Each weight is then divided by the mean of all the inverse class frequencies, so the weights average to 1 across the classes.

    The advantage is that you no longer have to worry about setting the class weights yourself: this should already be good for most applications.

    If you look further up in the source code, you will see that for class_weight=None, weight is filled with ones, so each class gets equal weight.
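
To make the contrast concrete, here is a small sketch that mirrors the quoted 'auto' logic next to the None case. It is an illustration only: the function names auto_class_weights and uniform_class_weights are hypothetical, and np.unique stands in for the LabelEncoder used in the quoted source.

```python
import numpy as np


def auto_class_weights(y):
    """Hypothetical re-creation of the deprecated 'auto' heuristic."""
    y = np.asarray(y)
    # np.unique replaces LabelEncoder here: y_ind maps each label to 0..k-1.
    classes, y_ind = np.unique(y, return_inverse=True)
    # Inversely proportional to the number of samples in each class.
    recip_freq = 1.0 / np.bincount(y_ind)
    return classes, recip_freq / np.mean(recip_freq)


def uniform_class_weights(y):
    """The class_weight=None case: every class gets weight 1."""
    classes = np.unique(y)
    return classes, np.ones(classes.shape[0])


# Eight samples of class 0, two of class 1.
y = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
classes, auto_w = auto_class_weights(y)
_, none_w = uniform_class_weights(y)
print(dict(zip(classes.tolist(), auto_w.tolist())))  # class 0 near 0.4, class 1 near 1.6
print(dict(zip(classes.tolist(), none_w.tolist())))  # both classes at 1.0
```

With None both classes count equally, while the 'auto' heuristic gives the minority class four times the weight of the majority class in this toy example.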
