Mapping column names to random forest feature importances

-上瘾入骨i 2021-02-04 12:28

I am trying to plot feature importances for a random forest model and map each feature importance back to the original coefficient. I've managed to create a plot that shows the

4 answers
  •  一生所求
    2021-02-04 13:15

    A sort of generic solution would be to throw the features/importances into a dataframe and sort them before plotting:

    import pandas as pd
    # Jupyter/IPython only: render plots inline (omit this line in a plain script)
    %matplotlib inline
    # Assumes the model has already been fitted.
    # "data" is the X DataFrame and "model" is the fitted scikit-learn estimator.
    
    # Map each feature name to its importance; the two sequences share the same order
    feats = dict(zip(data.columns, model.feature_importances_))
    
    # One row per feature, sorted ascending by importance, then plotted as bars
    importances = pd.DataFrame.from_dict(feats, orient='index').rename(columns={0: 'Gini-importance'})
    importances.sort_values(by='Gini-importance').plot(kind='bar', rot=45)
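
For anyone who wants to try the name-to-importance mapping without a fitted model at hand, here is a minimal sketch of the same idea using a pandas Series instead of a dict. The helper name `importance_series` and the feature names and values are hypothetical placeholders, not part of the answer above; in real use you would pass `data.columns` and `model.feature_importances_`.

```python
import pandas as pd


def importance_series(feature_names, importances):
    """Pair each feature name with its importance, sorted from most to least important.

    Hypothetical helper for illustration; it relies on the names and the
    importances being in the same order, as they are for X.columns and
    feature_importances_ when the model was fitted on X.
    """
    names = list(feature_names)
    values = list(importances)
    if len(names) != len(values):
        raise ValueError("feature_names and importances must have the same length")
    return pd.Series(values, index=names, name="importance").sort_values(ascending=False)


# Made-up values standing in for data.columns and model.feature_importances_
ranked = importance_series(["age", "income", "tenure"], [0.2, 0.5, 0.3])
print(ranked)
```

If matplotlib is installed, `ranked.plot(kind="bar", rot=45)` should give a bar chart similar to the one in the answer, with the bars ordered from most to least important.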
    
