Use of scikit Random Forest sample_weights

无人及你 2021-02-06 06:25

I've been trying to figure out scikit's Random Forest sample_weight use and I cannot explain some of the results I'm seeing. Fundamentally I need it to balance a classificati

1 Answer
  • 2021-02-06 06:47

    As the name implies, the Random Forest algorithm has some randomness built into it.

    You are getting different F1 scores because the Random Forest algorithm trains each decision tree on a random subset of your data and then averages across all of the trees. I am not surprised, therefore, that you see similar (but non-identical) F1 scores from run to run.
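To check that the run-to-run variation really comes from this randomness, one option is to fix the seed. The sketch below is only an illustration: the dataset is synthetic (made with `make_classification`), and the helper name `fit_and_score` and all parameter values are assumptions, not taken from the question. It relies on the `random_state` parameter of `RandomForestClassifier`.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced two-class data (an assumption for illustration only).
X, y = make_classification(
    n_samples=500, n_features=10, weights=[0.8, 0.2], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)


def fit_and_score(seed=None):
    """Train a forest with the given seed and return its F1 score on the test split."""
    clf = RandomForestClassifier(n_estimators=50, random_state=seed)
    clf.fit(X_train, y_train)
    return f1_score(y_test, clf.predict(X_test))


# Unseeded runs draw different bootstrap samples, so the scores may differ.
print("unseeded:", fit_and_score(), fit_and_score())

# Seeded runs build the same trees, so the scores match.
score_a = fit_and_score(seed=42)
score_b = fit_and_score(seed=42)
print("seeded:", score_a, score_b)
```

If the seeded scores match while the unseeded ones drift, the differences you saw are most likely due to the forest's sampling rather than to the weights.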

    I have tried balancing the weights before. You may want to try balancing the weights by the size of each class in the population. For example, if you were to have two classes as such:

    Class A: 5 members
    Class B: 2 members
    

    You may wish to balance the weights by assigning 2/7 to each of Class A's members and 5/7 to each of Class B's members, so that both classes carry the same total weight (5 × 2/7 = 2 × 5/7 = 10/7). That's just an idea as a starting place, though. How you weight your classes will depend on the problem you have.
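The 5-versus-2 example can be sketched in code as follows. This is a minimal illustration under stated assumptions: the feature matrix `X` is random made-up data, and the names `class_weight` and `sample_weight` are chosen here for readability. The weighting rule (each sample gets the share of the population that is not in its class) reproduces the 2/7 and 5/7 values for two classes; it is one possible scheme, not the only one.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Labels from the example: 5 members of class A, 2 members of class B.
y = np.array(["A"] * 5 + ["B"] * 2)

# Made-up features, purely so the classifier has something to fit.
rng = np.random.default_rng(0)
X = rng.normal(size=(len(y), 3))

# Weight each class by the fraction of samples that are NOT in it:
# A -> (7 - 5) / 7 = 2/7, B -> (7 - 2) / 7 = 5/7.
classes, counts = np.unique(y, return_counts=True)
class_weight = {c: (len(y) - n) / len(y) for c, n in zip(classes, counts)}

# Expand the per-class weights into one weight per sample.
sample_weight = np.array([class_weight[label] for label in y])

# Pass the per-sample weights to fit().
clf = RandomForestClassifier(n_estimators=10, random_state=0)
clf.fit(X, y, sample_weight=sample_weight)

print(class_weight)
print(sample_weight[y == "A"].sum(), sample_weight[y == "B"].sum())
```

scikit-learn also accepts `class_weight="balanced"` in the `RandomForestClassifier` constructor, which weights classes inversely to their frequency; that may be a simpler starting point than building `sample_weight` by hand.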
