List Highest Correlation Pairs from a Large Correlation Matrix in Pandas?

心在旅途 2020-12-22 17:45

How do you find the top correlations in a correlation matrix with Pandas? There are many answers on how to do this with R (Show correlations as an ordered list, not as a lar

13 answers
  • 2020-12-22 18:34

    This is an improved version of @MiFi's code. It orders the pairs by absolute value without excluding the negative correlations.

        import numpy as np
        import pandas as pd

        def top_correlation(df, n):
            # Pairwise correlation matrix of the DataFrame's columns
            corr_matrix = df.corr()
            # Mask for the strict upper triangle: each pair once, diagonal excluded
            upper = np.triu(np.ones(corr_matrix.shape), k=1).astype(bool)
            correlation = corr_matrix.where(upper).stack().reset_index()
            correlation.columns = ["Variable_1", "Variable_2", "Correlacion"]
            # Order by absolute value (strongest first) while keeping the sign
            order = correlation["Correlacion"].abs().sort_values(ascending=False).index
            return correlation.reindex(order).reset_index(drop=True).head(n)

        # ANYDATA is a placeholder for any numeric DataFrame
        top_correlation(ANYDATA, 10)
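
To see the absolute-value ordering on concrete numbers, here is a minimal, self-contained sketch on made-up data. The helper name `top_abs_correlations` and the `toy` DataFrame are hypothetical. The sketch also swaps the `where`/`stack` step for `np.triu_indices`, which selects the same upper-triangle pairs, so treat it as one possible variant rather than the answer's exact code.

```python
import numpy as np
import pandas as pd

def top_abs_correlations(df, n):
    """Return the n column pairs with the largest |correlation|, signs kept."""
    corr = df.corr()
    names = corr.columns.to_numpy()
    # Positions of the strict upper triangle: each pair once, no diagonal
    rows, cols = np.triu_indices(len(names), k=1)
    pairs = pd.DataFrame({
        "Variable_1": names[rows],
        "Variable_2": names[cols],
        "Correlacion": corr.to_numpy()[rows, cols],
    })
    # Rank by absolute value so strong negative correlations are not pushed down
    order = pairs["Correlacion"].abs().sort_values(ascending=False).index
    return pairs.loc[order].reset_index(drop=True).head(n)

# Hypothetical toy data: b rises with a, c falls with a, d is unrelated noise
rng = np.random.default_rng(0)
a = np.arange(50, dtype=float)
toy = pd.DataFrame({
    "a": a,
    "b": 2 * a + 1,
    "c": -3 * a + rng.normal(scale=5.0, size=50),
    "d": rng.normal(size=50),
})

result = top_abs_correlations(toy, 3)
print(result)
```

With this data the perfectly linear pair (a, b) should come first, followed by the two strongly negative pairs involving c. A plain descending sort would instead place those negative pairs at the very bottom.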
    