How to calculate p-values for pairwise correlation of columns in Pandas?

执念已碎 2021-02-09 11:01

Pandas has a very handy method for pairwise correlation of columns, DataFrame.corr() (called as df.corr()). That means it is possible to compare correlations between columns of any length. For in

4 answers
  •  栀梦 (original poster)
    2021-02-09 11:53

    Why not use the "method" argument of pandas.DataFrame.corr()? It accepts:

    • pearson : standard correlation coefficient.
    • kendall : Kendall Tau correlation coefficient.
    • spearman : Spearman rank correlation.
    • callable: callable with input two 1d ndarrays and returning a float.
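
    A callable that returns the p-value instead of the coefficient turns the result into a matrix of p-values:

    from scipy.stats import kendalltau, pearsonr, spearmanr

    # Each SciPy function returns (statistic, p-value); index 1 is the p-value.
    def kendall_pval(x, y):
        return kendalltau(x, y)[1]

    def pearsonr_pval(x, y):
        return pearsonr(x, y)[1]

    def spearmanr_pval(x, y):
        return spearmanr(x, y)[1]
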

    and then

    corr = df.corr(method=pearsonr_pval)
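
For a quick check of what this returns, here is a minimal sketch on made-up data (the DataFrame, its column names "a", "b", "c", and the random seed are assumptions for illustration, not part of the question). One thing to keep in mind: when "method" is a callable, pandas fills the diagonal with 1 and does not call the function there, so the diagonal entries of the p-value matrix are placeholders rather than real p-values.

```python
import numpy as np
import pandas as pd
from scipy.stats import pearsonr


def pearsonr_pval(x, y):
    # pearsonr returns (statistic, p-value); keep only the p-value.
    return pearsonr(x, y)[1]


# Hypothetical demo data: "b" depends on "a", "c" is independent noise.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
df = pd.DataFrame({
    "a": a,
    "b": a + rng.normal(scale=0.5, size=200),
    "c": rng.normal(size=200),
})

coef = df.corr(method="pearson")       # correlation coefficients
pvals = df.corr(method=pearsonr_pval)  # p-values, same row/column labels

print(coef.round(3))
print(pvals.round(4))
```

Both frames share the same labels, so a p-value can be looked up next to its coefficient, for example pvals.loc["a", "b"] alongside coef.loc["a", "b"].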
    
