My task is to create a program that simulates a discrete-time Markov chain for an arbitrary number of events. Right now, the part I'm struggling with is creating the right stochastic matrix that will represent the probabilities. A right stochastic matrix is a matrix whose entries in each row sum to 1. For a given size, I more or less know how to write a matrix that does that; the problem is that I don't know how to do it for an arbitrary size.
For example: here is my code for a 3x3 matrix, along with an example of the output I was getting.
However, my code doesn't work every time -- sometimes the third entry in a row is negative because the first two are too large. I don't know how to get around that; as far as I know, there isn't a function in Python that generates random numbers that sum to a particular value.
Any help is appreciated.
(Note that this isn't a homework problem, it's only for extra credit in my Math class and the professor doesn't mind the use of outside sources.)
Using @MBo's idea:
In [16]: matrix = np.random.rand(3,3)
In [17]: matrix/matrix.sum(axis=1)[:,None]
Out[17]:
array([[ 0.25429337, 0.22502947, 0.52067716],
[ 0.17744651, 0.42358254, 0.39897096],
[ 0.36179247, 0.28707039, 0.35113714]])
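To make this work for an arbitrary size, the two lines above can be wrapped in a function. This is only a sketch: the name `random_stochastic_matrix` and the optional `rng` argument are my own choices, not something from the question, and it assumes no row of random values sums to exactly zero.

```python
import numpy as np

def random_stochastic_matrix(n, rng=None):
    """Return an n x n array of non-negative floats whose rows each sum to 1."""
    rng = np.random.default_rng() if rng is None else rng
    matrix = rng.random((n, n))  # uniform values in [0, 1)
    # keepdims=True keeps the row sums as a column, so each row is divided by its own sum
    return matrix / matrix.sum(axis=1, keepdims=True)

P = random_stochastic_matrix(4)
print(P)
print(P.sum(axis=1))  # every entry should be 1 up to floating-point rounding
```

`keepdims=True` does the same job as the `[:,None]` indexing in the session above.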
Generate an NxN matrix with random values.
For every row:
    Find the sum of the row, S:
        S[j] = Sum(0..N-1){A[j, i]}
    Then subtract (S - 1)/N from every value in this row:
        A[j, i] = A[j, i] - (S[j] - 1) / N
If you need only non-negative values, generate non-negative randoms and divide every value in each row by the sum of that row:
    A[j, i] = A[j, i] / S[j]
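The two variants above could be sketched in plain Python (no NumPy) roughly as follows. The function names `shift_rows` and `normalize_rows` are made up for illustration, and the sketch assumes Python 3 division and rows with a non-zero sum.

```python
import random

def shift_rows(a):
    """Subtract (S - 1) / N from every entry of each row; entries may become negative."""
    n = len(a[0])
    return [[x - (sum(row) - 1) / n for x in row] for row in a]

def normalize_rows(a):
    """Divide every entry by its row sum; non-negative input stays non-negative."""
    return [[x / sum(row) for x in row] for row in a]

n = 4
a = [[random.random() for _ in range(n)] for _ in range(n)]
for row in normalize_rows(a):
    print(row, sum(row))  # each sum should be 1 up to rounding
```

Both functions give rows that sum to 1; only the second one keeps all entries non-negative, which is what a probability matrix needs.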
Here is some code:
import random

precision = 1000000

def f(n):
    matrix = []
    for l in range(n):
        lineLst = []
        total = 0  # running sum of the integers drawn for this row
        crtPrec = precision  # what is still left to distribute in this row
        for i in range(n - 1):
            val = random.randrange(crtPrec)
            total += val
            lineLst.append(float(val) / precision)
            crtPrec -= val
        # the last entry takes whatever remains, so the row sums to 1
        lineLst.append(float(precision - total) / precision)
        matrix.append(lineLst)
    return matrix

matrix = f(5)
print(matrix)
I assumed the random numbers have to be non-negative and the sum of the numbers in a row has to be 1. I used the precision given in the variable 'precision'; if this is 1000, the random numbers will have 3 digits after the decimal point. In my example 6 digits are used; you may use more.
Output:
[[0.086015, 0.596464, 0.161664, 0.03386, 0.121997],
[0.540478, 0.040961, 0.374275, 0.003793, 0.040493],
[0.046263, 0.249761, 0.460089, 0.006739, 0.237148],
[0.594743, 0.125554, 0.142809, 0.056124, 0.08077],
[0.746161, 0.151382, 0.068062, 0.005772, 0.028623]]
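One caveat with this approach: each value is drawn from whatever is left of the row's budget, so entries toward the end of a row tend to be much smaller than the first one. If that bias matters, one alternative (not part of the code above) is to draw each row from a flat Dirichlet distribution, which is uniform over all non-negative rows that sum to 1. A minimal sketch, assuming NumPy is available; `dirichlet_stochastic_matrix` is a hypothetical helper name:

```python
import numpy as np

def dirichlet_stochastic_matrix(n, rng=None):
    """Return an n x n array where each row is one draw from Dirichlet(1, ..., 1)."""
    rng = np.random.default_rng() if rng is None else rng
    # alpha = all ones gives a uniform distribution over rows that sum to 1;
    # size=n requests n independent rows
    return rng.dirichlet(np.ones(n), size=n)

P = dirichlet_stochastic_matrix(5)
print(P)
print(P.sum(axis=1))  # every entry should be 1 up to floating-point rounding
```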
A right stochastic matrix is a real square matrix, with each row summing to 1.
Here's a sample you can create a function from; I leave that to you as homework.
In [26]: import numpy as np
In [27]: N, M = 5, 5
In [28]: matrix = np.random.rand(N, M)
In [29]: matrix
Out[29]:
array([[ 0.27926909, 0.37026136, 0.35978443, 0.75216853, 0.53517512],
[ 0.93285517, 0.54825643, 0.43948394, 0.15134782, 0.31310007],
[ 0.91934362, 0.51707873, 0.3604323 , 0.78487053, 0.85757986],
[ 0.53595238, 0.80467646, 0.88001499, 0.4668259 , 0.63567632],
[ 0.83359167, 0.41603073, 0.21192656, 0.22650423, 0.95721952]])
In [30]: matrix = np.apply_along_axis(lambda x: x - (np.sum(x) - 1)/len(x), 1, matrix)
In [31]: matrix
Out[31]:
array([[ 0.01993739, 0.11092965, 0.10045272, 0.49283682, 0.27584341],
[ 0.65584649, 0.27124774, 0.16247526, -0.12566087, 0.03609139],
[ 0.43148261, 0.02921772, -0.12742871, 0.29700952, 0.36971886],
[ 0.07132317, 0.34004725, 0.41538578, 0.00219669, 0.17104711],
[ 0.50453713, 0.08697618, -0.11712798, -0.10255031, 0.62816498]])
Explanation
We create an N x M matrix.
We then calculate (sum - 1) / M for each row, where M is the number of entries in the row (here N = M), to be subtracted from each item in that row.
Then we apply it to each row of the matrix by using np.apply_along_axis() with axis=1.
Verify the result
Each row needs to sum up to 1
In [37]: matrix.sum(axis=1)
Out[37]: array([ 1., 1., 1., 1., 1.])
but how do I subtract that value from each entry in the row?
In my example I've used a lambda that is equivalent to this function:
def subtract_value(x):
    return x - (np.sum(x) - 1)/len(x)
You can pass a function to apply_along_axis() to be called on each 1-D slice along the given axis; in our case, the slices are the rows.
There are other ways too, like numpy.vectorize() and numpy.frompyfunc().
Writing a function and applying it with one of the methods above is better than looping through each item in each row: it is typically faster, it is less code, and the intent is easier to read and understand.
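For this particular operation, plain broadcasting is another option that avoids calling a Python function per row. The following sketch compares it with the apply_along_axis() version; the seed and the variable names `row_by_row` and `broadcast` are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
matrix = rng.random((5, 5))

def subtract_value(x):
    # x is one row; shift it so that its entries sum to 1
    return x - (np.sum(x) - 1) / len(x)

# one Python call per row
row_by_row = np.apply_along_axis(subtract_value, 1, matrix)

# same arithmetic on the whole array at once; keepdims=True keeps the
# row sums as a column so they line up with the rows
broadcast = matrix - (matrix.sum(axis=1, keepdims=True) - 1) / matrix.shape[1]

print(np.allclose(row_by_row, broadcast))
print(broadcast.sum(axis=1))
```

Note that, as in the output above, this subtraction can leave negative entries, so it only guarantees the row sums.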
One small point has been missed. A stochastic matrix is an M x N matrix of non-negative elements whose rows sum to 1.0. MBo's comment above states that:
If you need only non-negative values, generate non-negative randoms, and divide every value in row by sum of this row
A[j, i] = A[j, i] / S[j]
This holds whenever the generated values are non-negative and every row sum is positive; the values do not have to be whole numbers. Negative entries appear only if the generated values themselves can be negative, or if the subtraction method is used instead, so make sure the random values are non-negative.
This can be accomplished using, for example:
X[i, j] = Math.Abs(random.Next(100, 900));
Source: https://stackoverflow.com/questions/31364744/creating-a-matrix-of-arbitrary-size-where-rows-sum-to-1