How to correlate an Ordinal Categorical column in pandas?

长情又很酷 2021-01-31 19:13

I have a DataFrame df with a non-numerical column CatColumn.

   A         B         CatColumn
0  381.1396  7.343921  Medium
1  481.3268         


        
3 Answers
  •  长情又很酷
    2021-01-31 19:51

    The right way to correlate a categorical column with N values is to split this column into N separate boolean columns.

    Let's take the DataFrame from the original question and create one boolean column per category:

    # Loop over the distinct categories only, not over every row value
    for cat in df.CatColumn.astype('category').cat.categories:
        df[cat] = df.CatColumn == cat
    

    The correlation between each category and every other column can then be calculated:

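    # Drop the original string column first: pandas 2.0+ raises on non-numeric columns
    df.drop(columns='CatColumn').corr()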
    

    Output:

                        A         B    Medium      High  Medium-High
    A            1.000000  0.490608  0.914322 -0.312309    -0.743459
    B            0.490608  1.000000  0.343620  0.548589    -0.945367
    Medium       0.914322  0.343620  1.000000 -0.577350    -0.577350
    High        -0.312309  0.548589 -0.577350  1.000000    -0.333333
    Medium-High -0.743459 -0.945367 -0.577350 -0.333333     1.000000
    
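For reference, the same idea can be sketched end to end with `pd.get_dummies`, which builds the indicator columns in one call. This is only a minimal sketch: the numbers in `A` and `B` are made up for illustration (the question's data is truncated), and only the category labels follow the output above.

```python
import pandas as pd

# Hypothetical sample data; the values in A and B are invented for illustration.
df = pd.DataFrame({
    "A": [381.1, 481.3, 112.5, 246.7],
    "B": [7.3, 9.1, 5.0, 2.6],
    "CatColumn": ["Medium", "Medium", "High", "Medium-High"],
})

# One 0/1 indicator column per distinct category value.
dummies = pd.get_dummies(df["CatColumn"], dtype=float)

# Join the indicators to the numeric columns and correlate everything.
corr = pd.concat([df[["A", "B"]], dummies], axis=1).corr()
print(corr.round(3))
```

Unlike the loop, this leaves `df` itself unchanged, so the original string column never reaches `corr()`.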
