I have a class imbalance problem and have been experimenting with a weighted Random Forest using the implementation in scikit-learn (>= 0.16).
I have noticed that the implementation takes a `class_weight` parameter in the constructor and a `sample_weight` parameter in the `fit` method, and I am unsure how the two relate.
Random Forests are built on decision trees, which are very well documented; check how the trees use sample weighting.
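To illustrate, here is a minimal sketch (toy data and weight values made up for this example): `DecisionTreeClassifier.fit` accepts a per-sample weight array, and the tree's split criterion is then computed on weighted sample counts.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy data: class 0 is the minority class
X = np.array([[0.0], [0.1], [0.9], [1.0], [1.1]])
y = np.array([0, 0, 1, 1, 1])

# Hand-rolled weights that up-weight the minority class
w = np.where(y == 0, 3.0, 1.0)

tree = DecisionTreeClassifier(random_state=0)
tree.fit(X, y, sample_weight=w)  # impurity/splits are computed on weighted counts
```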
As for the difference between `class_weight` and `sample_weight`: much can be determined simply from the nature of their datatypes. `sample_weight` is a 1D array of length `n_samples`, assigning an explicit weight to each example used for training. `class_weight` is either a dictionary mapping each class to a uniform weight for that class (e.g., `{1: 0.9, 2: 0.5, 3: 0.01}`), or a string telling sklearn how to determine this dictionary automatically.
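A minimal sketch of where each one is passed (the data and weight values here are arbitrary, just reusing the dictionary above):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.RandomState(0)
X = rng.rand(100, 4)
y = rng.randint(1, 4, size=100)          # classes 1, 2, 3

# class_weight: one weight per class, given to the constructor
clf = RandomForestClassifier(class_weight={1: 0.9, 2: 0.5, 3: 0.01},
                             random_state=0)

# sample_weight: one weight per training example, given to fit()
w = np.ones(len(y))
w[y == 3] = 10.0                          # e.g. up-weight a rare class by hand
clf.fit(X, y, sample_weight=w)
```

In the string form, `class_weight='balanced'` (called `'auto'` in older releases) asks sklearn to build that dictionary itself, with weights inversely proportional to the class frequencies in the training data.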
So the training weight for a given example is the product of its explicitly named `sample_weight` (or `1` if `sample_weight` is not provided) and its `class_weight` (or `1` if `class_weight` is not provided).