i used pipeline and grid_search to select the best parameters and then used these parameters to fit the best pipeline (\'best_pipe\'). However since the feature_selection (Selec
You can access the feature selector by name in best_pipe
:
features = best_pipe.named_steps['feat']
Then you can call transform()
on an index array to get the names of the selected columns:
X.columns[features.transform(np.arange(len(X.columns)))]
The output here will be the eighty column names selected in the pipeline.
This could be an instructive alternative: I encountered a similar need as what was asked by OP. If one wants to get the k best features' indices directly from GridSearchCV
:
finalFeatureIndices = gs.best_estimator_.named_steps["feat"].get_support(indices=True)
And via index manipulation, can get your finalFeatureList
:
finalFeatureList = [initialFeatureList[i] for i in finalFeatureIndices]
Jake's answer totally works. But depending on what feature selector your using, there's another option that I think is cleaner. This one worked for me:
X.columns[features.get_support()]
It gave me an identical answer to Jake's answer. And you can see more about it in the docs, but get_support
returns an array of true/false values for whether or not the column was used. Also, it's worth noting that X
must be of identical shape to the training data used on the feature selector.