Numpy array with different standard deviation per row

Question

I'd like to get an NxM matrix in which each row holds random samples drawn from a different normal distribution (same mean, but a different standard deviation per row). The following code works:

import numpy as np

mean = 0.0 # same mean
stds = [1.0, 2.0, 3.0] # different stds
matrix = np.empty((len(stds), 10)) # preallocate; every row is overwritten below

for i, std in enumerate(stds):
    matrix[i] = np.random.normal(mean, std, matrix.shape[1])

However, this code is not very efficient because it loops over the rows in Python. Is there a faster way to do this?

Answer 1:

np.random.normal() is vectorized; you can switch axes and transpose the result:

np.random.seed(444)
arr = np.random.normal(loc=0., scale=[1., 2., 3.], size=(1000, 3)).T

print(arr.mean(axis=1))
# [-0.06678394 -0.12606733 -0.04992722]
print(arr.std(axis=1))
# [0.99080274 2.03563299 3.01426507]

That is, scale is broadcast against the last axis of size, so with size=(1000, 3) each of the 3 columns gets its own standard deviation. Since you want one standard deviation per row, transpose the result with .T.
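The same broadcasting rule can also be used without the transpose, by giving scale the shape of a column. The following is only a sketch of that variant: it assumes the newer np.random.default_rng Generator API rather than the np.random.seed call used in this answer, and the names rng and n_cols are illustrative.

```python
import numpy as np

rng = np.random.default_rng(444)  # seeded Generator (assumption: not the legacy np.random.seed API)

mean = 0.0
stds = np.array([1.0, 2.0, 3.0])
n_cols = 1000  # illustrative number of samples per row

# stds[:, np.newaxis] has shape (3, 1). It broadcasts against size=(3, n_cols),
# so row i is drawn with standard deviation stds[i] and no transpose is needed.
matrix = rng.normal(loc=mean, scale=stds[:, np.newaxis], size=(stds.size, n_cols))

print(matrix.shape)        # (3, 1000)
print(matrix.std(axis=1))  # roughly [1, 2, 3]; exact values depend on the seed
```

Whether this is preferable to the .T version is mostly a matter of taste; both avoid the Python loop.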

Answer 2:

How about this?

import numpy as np

rows = 10000
stds = [1, 5, 10]

data = np.random.normal(size=(rows, len(stds)))
scaled = data * stds

print(np.std(scaled, axis=0))

Output:

[ 0.99417905  5.00908719 10.02930637]

This exploits the fact that any normal distribution can be obtained from the standard normal by a linear transformation (here, multiplying by the standard deviation; the mean stays at 0). In the output, each column (second axis) contains a normally distributed variable whose standard deviation is the corresponding value in stds. Transpose with .T if you need one row per standard deviation.
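For the row-wise layout asked for in the question, and for a mean other than 0, the scaling idea might be sketched as follows. The helper name sample_rows and its parameters are hypothetical, and the use of np.random.default_rng is an assumption rather than part of this answer.

```python
import numpy as np

def sample_rows(mean, stds, n_cols, rng=None):
    """Hypothetical helper: row i holds n_cols samples from N(mean, stds[i]**2)."""
    rng = np.random.default_rng() if rng is None else rng
    stds = np.asarray(stds, dtype=float)
    # Draw standard-normal samples, one row per standard deviation.
    z = rng.standard_normal((stds.size, n_cols))
    # Scale each row by its own standard deviation, then shift by the shared mean.
    return mean + stds[:, np.newaxis] * z

matrix = sample_rows(mean=5.0, stds=[1, 5, 10], n_cols=10_000,
                     rng=np.random.default_rng(0))
print(matrix.shape)         # (3, 10000)
print(matrix.std(axis=1))   # roughly [1, 5, 10]
print(matrix.mean(axis=1))  # roughly [5, 5, 5]
```

The shift is applied after the scaling, so the standard deviations are unaffected by the choice of mean.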

Source: https://stackoverflow.com/questions/55788467/numpy-array-with-different-standard-deviation-per-row

Tags: python, numpy, vectorization, gaussian, normal-distribution