sklearn logistic regression with unbalanced classes

醉酒成梦 2021-01-31 10:42

I'm solving a classification problem with sklearn's logistic regression in Python.

My problem is a general/generic one. I have a dataset with two classes/result (posi

2 Answers
  • 2021-01-31 10:58

    Have you tried passing class_weight="auto" to your classifier? Not all classifiers in sklearn support this, but some do. Check the docstrings.

    You can also rebalance your dataset by randomly dropping negative examples and/or over-sampling positive examples (potentially adding some slight Gaussian noise to the features of the duplicated samples).
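
    The two resampling options above can be sketched with plain NumPy. This is only an illustration under stated assumptions: the labels are binary (0 = negative/majority, 1 = positive/minority), and the helper names `undersample_majority` and `oversample_minority`, the toy data, and the `noise_std` value are all hypothetical, not part of sklearn.

```python
import numpy as np

def undersample_majority(X, y, majority_label=0, seed=0):
    """Randomly drop majority-class rows until both classes have equal size."""
    rng = np.random.default_rng(seed)
    maj = np.flatnonzero(y == majority_label)
    mino = np.flatnonzero(y != majority_label)
    keep = rng.choice(maj, size=len(mino), replace=False)
    idx = np.concatenate([keep, mino])
    rng.shuffle(idx)
    return X[idx], y[idx]

def oversample_minority(X, y, minority_label=1, noise_std=0.01, seed=0):
    """Duplicate minority-class rows (with replacement) until both classes
    have equal size, adding slight Gaussian noise to the duplicated features."""
    rng = np.random.default_rng(seed)
    mino = np.flatnonzero(y == minority_label)
    n_extra = int(np.sum(y != minority_label)) - len(mino)
    extra = rng.choice(mino, size=n_extra, replace=True)
    X_extra = X[extra] + rng.normal(0.0, noise_std, size=X[extra].shape)
    return np.vstack([X, X_extra]), np.concatenate([y, y[extra]])

# Toy imbalanced data (hypothetical): roughly 10% positives.
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 3))
y = (rng.random(1000) < 0.1).astype(int)

X_under, y_under = undersample_majority(X, y)
X_over, y_over = oversample_minority(X, y)
print(np.bincount(y_under), np.bincount(y_over))
```

    Resample only the training split; the validation and test sets should keep the original class distribution so the reported metrics stay meaningful.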

  • 2021-01-31 11:04

    @agentscully Have you read the SMOTE paper (https://www.jair.org/media/953/live-953-2037-jair.pdf)? I found it very informative. Here is the link to the Repo. Depending on how you go about balancing your target classes, you can set class_weight to one of the following:

    • 'auto' (deprecated as of version 0.17 in favour of 'balanced'), or specify the class weights yourself as a dict, e.g. {0: 0.1, 1: 0.9}.
    • 'balanced': this mode sets the weights inversely proportional to the class frequencies: n_samples / (n_classes * np.bincount(y)).
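
    As a rough illustration of the options listed above, the snippet below computes the 'balanced' weights by hand from the formula, compares them with sklearn's `compute_class_weight` helper, and fits `LogisticRegression` with both 'balanced' and an explicit dict. The toy data (90 negatives, 10 positives) and the variable names are assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.utils.class_weight import compute_class_weight

# Toy imbalanced data (hypothetical): 90 negatives, 10 positives.
y = np.array([0] * 90 + [1] * 10)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2)) + y[:, None]  # shift positives so classes differ

# 'balanced' weights by hand: n_samples / (n_classes * np.bincount(y))
n_samples = len(y)
n_classes = len(np.unique(y))
manual_weights = n_samples / (n_classes * np.bincount(y))  # approx. [0.556, 5.0]

# The same weights from sklearn's helper.
sk_weights = compute_class_weight(class_weight="balanced",
                                  classes=np.unique(y), y=y)
print(manual_weights, sk_weights)

# Let sklearn derive the weights from the class frequencies ...
clf_balanced = LogisticRegression(class_weight="balanced").fit(X, y)

# ... or pass the per-class weights yourself as a dict.
clf_manual = LogisticRegression(class_weight={0: 0.1, 1: 0.9}).fit(X, y)
```

    With 'balanced', each positive sample here counts about nine times as much as a negative one in the loss, which mirrors the 1:9 ratio of the explicit dict.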

    Let me know if more insight is needed.
