Question
I have a correlation matrix like this:
    a    b    c
a   1    0.5  0.3
b   0.5  1    0.7
c   0.3  0.7  1
I want to transform it into a DataFrame in long format, with columns like this:
Letter1  Letter2  correlation
a        a        1
a        b        0.5
a        c        0.3
b        a        0.5
b        b        1
.        .        .
.        .        .
Is there a pandas command that does this? Thanks in advance.
As a follow-up, can I also assign a numeric value to each letter, like so:
Value1  Letter1  Value2  Letter2  correlation
1       a        1       a        1
1       a        2       b        0.5
1       a        3       c        0.3
2       b        1       a        0.5
2       b        2       b        1
.       .        .       .        .
.       .        .       .        .
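Answer 1:
Use stack followed by reset_index. stack moves the column labels into an inner index level, and reset_index turns both index levels into ordinary columns: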
# df is the correlation matrix from the question (a, b, c as both index and columns)
df1 = df.stack().reset_index()
df1.columns = ['Letter1', 'Letter2', 'correlation']
print(df1)
  Letter1 Letter2  correlation
0       a       a          1.0
1       a       b          0.5
2       a       c          0.3
3       b       a          0.5
4       b       b          1.0
5       b       c          0.7
6       c       a          0.3
7       c       b          0.7
8       c       c          1.0
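For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the matrix from the question by hand (the name long_df and the literal values are only for illustration) and names the columns inside the chain with rename_axis and Series.rename instead of assigning df1.columns afterwards. This is a variation on the approach above, not a different method.

```python
import pandas as pd

# Rebuild the correlation matrix from the question (values typed in by hand).
labels = ["a", "b", "c"]
df = pd.DataFrame(
    [[1.0, 0.5, 0.3],
     [0.5, 1.0, 0.7],
     [0.3, 0.7, 1.0]],
    index=labels,
    columns=labels,
)

# Name the row and column axes first, so that stack() produces a Series
# whose two index levels are already called Letter1 and Letter2.
long_df = (
    df.rename_axis(index="Letter1", columns="Letter2")
      .stack()
      .rename("correlation")   # name of the value column
      .reset_index()           # index levels become ordinary columns
)

print(long_df)
```

The result has one row per (Letter1, Letter2) pair, so a 3x3 matrix gives 9 rows.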
Then insert the new columns at the desired positions, filled with factorized values:
import pandas as pd
# factorize returns 0-based integer codes; add 1 so the numbering starts at 1
df1.insert(0, 'Value1', pd.factorize(df1['Letter1'])[0] + 1)
df1.insert(2, 'Value2', pd.factorize(df1['Letter2'])[0] + 1)
print(df1)
   Value1 Letter1  Value2 Letter2  correlation
0       1       a       1       a          1.0
1       1       a       2       b          0.5
2       1       a       3       c          0.3
3       2       b       1       a          0.5
4       2       b       2       b          1.0
5       2       b       3       c          0.7
6       3       c       1       a          0.3
7       3       c       2       b          0.7
8       3       c       3       c          1.0
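To see what factorize contributes on its own, here is a small sketch on a standalone Series (the sample values are made up for illustration). pd.factorize returns a pair: an array of 0-based integer codes and the unique values. Codes are assigned in order of first appearance, which matches alphabetical order above only because the labels already appear sorted. If the labels could appear in any order and alphabetical numbering is wanted, passing sort=True should give it.

```python
import pandas as pd

letters = pd.Series(["a", "a", "a", "b", "b", "b", "c", "c", "c"])

# codes are 0-based positions into uniques; add 1 for 1-based numbering.
codes, uniques = pd.factorize(letters)
value1 = codes + 1
print(value1.tolist())   # [1, 1, 1, 2, 2, 2, 3, 3, 3]
print(list(uniques))     # ['a', 'b', 'c']

# Codes follow order of first appearance, not alphabetical order.
unsorted = pd.Series(["b", "a", "b"])
by_appearance = pd.factorize(unsorted)[0].tolist()
by_sorted = pd.factorize(unsorted, sort=True)[0].tolist()
print(by_appearance)     # [0, 1, 0]
print(by_sorted)         # [1, 0, 1]
```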
Source: https://stackoverflow.com/questions/53469212/transforming-a-correlation-matrix-to-a-3-column-dataframe-in-pandas