Question
I am trying to generate a feature importance plot using Permutation Feature Importance (PFI). I want to check whether the features returned by different approaches are stable, so that I can select an optimal set of features. Can we get a p-value, or something of that sort, that indicates whether a feature is significant? If PFI agreed with my other approaches I could be more confident, but its results look entirely opposite.
Here is my code to generate the plot:
import eli5
from eli5.sklearn import PermutationImportance
from sklearn.linear_model import LogisticRegression

logreg = LogisticRegression(random_state=1)  # I also tried Random Forest
logreg.fit(X_train_std, y_train)
perm = PermutationImportance(logreg, random_state=1).fit(X_train_std, y_train)
eli5.show_weights(perm)  # see the issue with the plot below
Questions
1) The feature at the top of the PFI output was non-significant in my other approaches (chi-square, XGBoost feature importance, the statsmodels logistic regression summary, etc.), so I am a bit shocked to see it there. Is the table sorted in decreasing or ascending order?
2) I understand that PFI randomizes a feature's values and measures the resulting change in model error. If the first row (X18) is an important feature, that is the exact opposite of what my other approaches say. Am I making a mistake here? What should I check in a situation like this? Or should I apply PFI only to features that were already selected as important?
3) How do I make the Jupyter cell display all rows? Currently it does not show the remaining 35 rows, as shown below. I have already set the pandas display options (column width, max rows, etc.).
Can you help me with this?
Answer 1:
Use the top= parameter to solve question 3, as in eli5.show_weights(perm, top=100). More in the docs.
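If the HTML table still gets cut off, or eli5 is not available, the same numbers can be printed as plain text. The sketch below is only an illustration under assumptions: it swaps in scikit-learn's permutation_importance for eli5, uses synthetic data from make_classification in place of the asker's X_train_std, and the feature names X0..X39 are made up. Rows are sorted from most to least important, which, as far as I know, is also the order eli5's table uses.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the asker's data: 40 features, 5 of them informative.
X, y = make_classification(n_samples=500, n_features=40, n_informative=5,
                           n_redundant=0, random_state=1)
feature_names = [f"X{i}" for i in range(X.shape[1])]  # hypothetical names
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

model = LogisticRegression(max_iter=1000, random_state=1).fit(X_train, y_train)

# Shuffle each column 10 times and record the drop in held-out accuracy.
result = permutation_importance(model, X_test, y_test,
                                n_repeats=10, random_state=1)

# Sort from most to least important and keep every row (no top-N cut-off).
order = np.argsort(result.importances_mean)[::-1]
rows = [(feature_names[i], result.importances_mean[i], result.importances_std[i])
        for i in order]
for name, mean, std in rows:
    print(f"{name:>4}  {mean:+.4f} +/- {std:.4f}")
```

The "+/-" column is the spread over the repeated shuffles. It is not a p-value, but a mean that sits within a couple of standard deviations of zero is a hint that the feature's rank should not be trusted much.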
For questions 1 and 2, I've been in a similar situation. As far as I know, different approaches do produce different outputs, because each approach has its own criterion. For tree-based approaches such as DecisionTree, xgboost, catboost, GBRT, etc., importance is accumulated while the trees are being built: the more a feature is used for splits, the more important it becomes. Other approaches, PFI included, do not measure importance this way, so their rankings need not agree.
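To see the two criteria side by side, here is a minimal sketch, assuming scikit-learn's RandomForestClassifier as the tree model and synthetic data rather than the asker's dataset. It compares the forest's built-in importance, which comes from the splits made during training, with permutation importance measured on held-out data. The two rankings often differ, especially for correlated or noisy features.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption for illustration): 8 features, 3 informative.
X, y = make_classification(n_samples=600, n_features=8, n_informative=3,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

# Criterion 1: impurity-based importance, accumulated from the splits chosen
# while the trees were built on the training data (normalized to sum to 1).
split_based = forest.feature_importances_

# Criterion 2: permutation importance, the drop in held-out accuracy
# when one column is shuffled, averaged over 20 shuffles.
perm = permutation_importance(forest, X_test, y_test,
                              n_repeats=20, random_state=0)

rank_split = np.argsort(split_based)[::-1]
rank_perm = np.argsort(perm.importances_mean)[::-1]
print("ranking by split-based importance:", rank_split.tolist())
print("ranking by permutation importance:", rank_perm.tolist())
```

Note that the code in the question computes PFI on the training data. Scoring on a held-out set, as in this sketch, is usually the safer choice, because a feature the model has overfit to can look important on the data it was trained on.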
Source: https://stackoverflow.com/questions/59390688/how-to-interpret-and-view-the-complete-permutation-feature-plot-in-jupyter