Sklearn Chi2 For Feature Selection

风格不统一 submitted on 2019-12-04 07:36:57

Your understanding is reversed.

The null hypothesis of the chi2 test is that "two categorical variables are independent". A higher chi2 statistic is therefore stronger evidence against independence: the feature and the target are more likely dependent, which makes the feature MORE USEFUL for classification.
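To make the direction concrete, here is a minimal sketch of the Pearson chi-squared statistic on two made-up 2x2 contingency tables. The helper name chi2_statistic and both tables are hypothetical, chosen only for illustration. Note that scikit-learn's chi2 works on per-class sums of non-negative feature values rather than on an explicit table like this, so this only illustrates the idea and does not reproduce its computation.

```python
def chi2_statistic(table):
    """Pearson chi-squared statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts: rows are feature values, columns are classes.
independent = [[25, 25], [25, 25]]  # feature tells you nothing about the class
dependent = [[40, 10], [10, 40]]    # feature is strongly tied to the class

print(chi2_statistic(independent))  # 0.0
print(chi2_statistic(dependent))    # 36.0
```

The table with no association scores 0, while the strongly associated one scores much higher, which is why a high score marks a feature worth keeping.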

SelectKBest gives you the best two (k=2) features, ranked by highest chi2 score. So you should keep the features it selects, rather than taking the "other features" that the chi2 selector leaves out.

You are correct to get the chi2 statistic from chi2_selector.scores_ and the best features from chi2_selector.get_support(). It will give you 'petal length (cm)' and 'petal width (cm)' as the top 2 features based on the chi2 test of independence. Hope this clarifies the algorithm.
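For reference, a minimal sketch of the whole flow might look like the following. It assumes the question used the iris dataset from sklearn.datasets.load_iris, which the feature names above suggest. The variable names other than chi2_selector are my own.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2

# Assumption: the data is the standard iris dataset.
iris = load_iris()
X, y = iris.data, iris.target

# Keep the k=2 features with the highest chi2 scores.
chi2_selector = SelectKBest(chi2, k=2)
X_kbest = chi2_selector.fit_transform(X, y)

# One chi2 score per original feature; higher means more dependent on the target.
for name, score in zip(iris.feature_names, chi2_selector.scores_):
    print(f"{name}: {score:.2f}")

# get_support() returns a boolean mask over the original features.
mask = chi2_selector.get_support()
selected = [name for name, keep in zip(iris.feature_names, mask) if keep]
print(selected)
```

The mask marks the features that were kept, so indexing with it (not with its negation) gives the two petal features.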
