Getting features in RFECV scikit-learn

百般思念 提交于 2019-12-11 01:56:10

问题


Inspired by this: http://scikit-learn.org/stable/auto_examples/feature_selection/plot_rfe_with_cross_validation.html#sphx-glr-auto-examples-feature-selection-plot-rfe-with-cross-validation-py

I am wondering if there is anyway to get the features for a particular score:

In that case, I would like to know, which 10 features selected gives that peak when #Features = 10.

Any ideas?

EDIT:

This is the code used to get that plot:

from sklearn.feature_selection import RFECV
from sklearn.model_selection import KFold,StratifiedKFold #for K-fold cross validation
from sklearn.ensemble import RandomForestClassifier #Random Forest

# The "accuracy" scoring is proportional to the number of correct classifications
#kfold = StratifiedKFold(n_splits=10, random_state=1) # k=10, split the data into 10 equal parts
model_Linear_SVM=svm.SVC(kernel='linear', probability=True)
rfecv = RFECV(estimator=model_Linear_SVM, step=1, cv=kfold,scoring='accuracy')   #5-fold cross-validation
rfecv = rfecv.fit(X, y)

print('Optimal number of features :', rfecv.n_features_)
print('Best features :', X.columns[rfecv.support_])
print('Original features :', X.columns)
plt.figure()
plt.xlabel("Number of features selected")
plt.ylabel("Cross validation score \n of number of selected features")
plt.plot(range(1, len(rfecv.grid_scores_) + 1), rfecv.grid_scores_)
plt.show()

回答1:


First, you can see which features it selected where the cross validation score is the largest (in your case this corresponds to the number of features 17 or 21, I am not sure from the figure) with

rfecv.support_

or

rfecv.ranking_ 

Then you can calculate the importances of selected features (for the peak of the cv score curve) by

np.absolute(rfecv.estimator_.coef_)

for simple estimators or

rfecv.estimator_.feature_importances_ 

if your estimator is some ensemble, like random forest.

Then you can remove the least important feature one by one in the loop, and recalculate rfecv for the remaining feature sets.



来源:https://stackoverflow.com/questions/50387089/getting-features-in-rfecv-scikit-learn

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!