transition matrix for counts and proportions python

橙三吉。 Submitted on 2019-12-13 03:23:50

Question


I have a matrix with the grades from a class for different years (rows for years and columns for grades). What I want is to build a transition matrix with the change between years.

For instance, I want year t-1 on the y-axis and year t on the x-axis, and then a transition matrix with the difference in the number of people with grade A between year t-1 and t, with grade B between year t-1 and t, and so on. Then I want a second transition matrix with the proportions, for example: between year t-1 and t there are z% more/fewer people with grade A/B/C/D/F.

Obviously the most important part is the diagonal, which would represent the change for the same grade between years.

I want this to be done in Python.

Thank you very much, I hope everything is clear.

Result example: (image not available)


Answer 1:


You can use the pandas library with df.diff to get the year-over-year change for each grade. NumPy can generate the matrix of all pairwise differences using np.subtract.outer. Below is an example.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
years = ['2015', '2016', '2017']
grades = ['A', 'B', 'C', 'D']

# Random counts, so the values printed below come from one particular run
df = pd.DataFrame(np.random.randint(0, 10, (3, 4)), columns=grades, index=years)

print(df)

      A  B  C  D
2015  5  0  2  0
2016  7  2  0  2
2017  3  7  6  7

df_diff = df.diff(axis=0)
print(df_diff)

Each row of df_diff is the difference between the current row and the preceding one in the original df. The first row is NaN because 2015 has no preceding year.

        A    B    C    D
2015  NaN  NaN  NaN  NaN
2016  2.0  2.0 -2.0  2.0
2017 -4.0  5.0  6.0  5.0
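
The question also asks for proportions, which df.diff alone does not give. A minimal sketch, assuming "proportion" means the percentage change relative to the previous year's count, and hard-coding the counts printed above so the result is reproducible. Masking the undefined cases as NaN is a choice made here, not something the original answer specifies.

```python
import numpy as np
import pandas as pd

years = ['2015', '2016', '2017']
grades = ['A', 'B', 'C', 'D']

# Same counts as the printed example, hard-coded instead of random
df = pd.DataFrame([[5, 0, 2, 0],
                   [7, 2, 0, 2],
                   [3, 7, 6, 7]], columns=grades, index=years)

# Change relative to the previous year's count, in percent
df_pct = df.diff() / df.shift(1) * 100

# A previous count of 0 makes the ratio infinite; treat it as undefined (assumption)
df_pct = df_pct.replace([np.inf, -np.inf], np.nan)

print(df_pct.round(1))
```

For example, grade A goes from 5 to 7 between 2015 and 2016, so that cell is 2 / 5 * 100 = 40%. Grade B has a count of 0 in 2015, so its 2016 percentage change is left as NaN.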

# Flatten the counts row by row and build a matching "year + grade" label for each cell
a = df.to_numpy(dtype=float).ravel()
differences = [y + g for y in years for g in grades]

# Entry [i, j] is a[i] - a[j], i.e. the row label's count minus the column label's count
df3 = pd.DataFrame(np.subtract.outer(a, a), columns=differences, index=differences)
print(df3)

       2015A  2015B  2015C  2015D  2016A  2016B  2016C  2016D  2017A  2017B  2017C  2017D
2015A    0.0    5.0    3.0    5.0   -2.0    3.0    5.0    3.0    2.0   -2.0   -1.0   -2.0
2015B   -5.0    0.0   -2.0    0.0   -7.0   -2.0    0.0   -2.0   -3.0   -7.0   -6.0   -7.0
2015C   -3.0    2.0    0.0    2.0   -5.0    0.0    2.0    0.0   -1.0   -5.0   -4.0   -5.0
2015D   -5.0    0.0   -2.0    0.0   -7.0   -2.0    0.0   -2.0   -3.0   -7.0   -6.0   -7.0
2016A    2.0    7.0    5.0    7.0    0.0    5.0    7.0    5.0    4.0    0.0    1.0    0.0
2016B   -3.0    2.0    0.0    2.0   -5.0    0.0    2.0    0.0   -1.0   -5.0   -4.0   -5.0
2016C   -5.0    0.0   -2.0    0.0   -7.0   -2.0    0.0   -2.0   -3.0   -7.0   -6.0   -7.0
2016D   -3.0    2.0    0.0    2.0   -5.0    0.0    2.0    0.0   -1.0   -5.0   -4.0   -5.0
2017A   -2.0    3.0    1.0    3.0   -4.0    1.0    3.0    1.0    0.0   -4.0   -3.0   -4.0
2017B    2.0    7.0    5.0    7.0    0.0    5.0    7.0    5.0    4.0    0.0    1.0    0.0
2017C    1.0    6.0    4.0    6.0   -1.0    4.0    6.0    4.0    3.0   -1.0    0.0   -1.0
2017D    2.0    7.0    5.0    7.0    0.0    5.0    7.0    5.0    4.0    0.0    1.0    0.0
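
The full 12 x 12 matrix mixes every year with every other year. If only one pair of consecutive years is of interest, with year t-1 on the rows and year t on the columns as the question describes, a smaller block can be built directly. This is a rough sketch: year_block is a hypothetical helper name, not part of pandas or NumPy, and the counts are the ones printed earlier.

```python
import numpy as np
import pandas as pd

years = ['2015', '2016', '2017']
grades = ['A', 'B', 'C', 'D']
df = pd.DataFrame([[5, 0, 2, 0],
                   [7, 2, 0, 2],
                   [3, 7, 6, 7]], columns=grades, index=years)

def year_block(df, prev_year, curr_year):
    """Hypothetical helper: grades of prev_year on rows, grades of curr_year on columns."""
    prev = df.loc[prev_year].to_numpy(dtype=float)
    curr = df.loc[curr_year].to_numpy(dtype=float)
    # outer(curr, prev)[j, i] = curr[j] - prev[i]; transpose so rows follow prev_year
    values = np.subtract.outer(curr, prev).T
    return pd.DataFrame(values,
                        index=[prev_year + g for g in df.columns],
                        columns=[curr_year + g for g in df.columns])

block = year_block(df, '2015', '2016')
print(block)

# The diagonal is the same-grade change, matching the 2016 row of df.diff()
print(np.diag(block.to_numpy()))
```

The diagonal of this block is the change for the same grade between the two years, which is the part the question calls most important.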

Plot this matrix using matshow from matplotlib:

plt.matshow(df3)
plt.colorbar()
plt.show()



Source: https://stackoverflow.com/questions/52682226/transition-matrix-for-counts-and-proportions-python
