Python difference between randn and normal

I'm using the randn and normal functions from Python's numpy.random module. The functions are pretty similar from what I've read in the

3 Answers

傲寒

2021-01-31 16:04
I'm a statistician who sometimes codes, not vice-versa, so this is something I can answer with some accuracy.

Looking at the docs that you linked in your question, I'll highlight some of the key differences:

normal:
```
numpy.random.normal(loc=0.0, scale=1.0, size=None)
# Draw random samples from a normal (Gaussian) distribution.

# Parameters :  
# loc : float -- Mean (“centre”) of the distribution.
# scale : float -- Standard deviation (spread or “width”) of the distribution.
# size : tuple of ints -- Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn.
```
So in this case, you're drawing samples from a GENERIC normal distribution, one whose mean and standard deviation you choose (more details on what that means later).

randn:
```
numpy.random.randn(d0, d1, ..., dn)
# Return a sample (or samples) from the “standard normal” distribution.

# Parameters :  
# d0, d1, ..., dn : int, optional -- The dimensions of the returned array, should be all positive. If no argument is given a single Python float is returned.
# Returns : 
# Z : ndarray or float -- A (d0, d1, ..., dn)-shaped array of floating-point samples from the standard normal distribution, or a single such float if no parameters were supplied.
```
In this case, you're drawing samples from a SPECIFIC normal distribution, the standard normal distribution.
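One practical difference visible in the two signatures is how the output shape is passed. A minimal sketch of that difference, with variable names chosen only for illustration:

```python
import numpy as np

# randn takes each dimension as a separate positional argument.
a = np.random.randn(2, 3)

# normal takes the shape as a single `size` argument (an int or a tuple).
b = np.random.normal(loc=0.0, scale=1.0, size=(2, 3))

print(a.shape, b.shape)

# With no shape given, each call returns a single float instead of an array.
x = np.random.randn()
y = np.random.normal()
print(type(x), type(y))
```

Both arrays come out with shape (2, 3); only the calling convention differs.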

Now some of the math, which is really needed to get at the heart of your question:

A normal distribution is a distribution where the values are more likely to occur near the mean value. There are a bunch of cases of this in nature. E.g., the average high temperature in Dallas in June is, let's say, 95 F. In a given year the average might reach 100, or even 105, but more typically it will be near 95 or 97. Similarly, it might fall as low as 80, but 85 or 90 is more likely.

So, it is fundamentally different from, say, a uniform distribution (rolling an honest 6-sided die).

A standard normal distribution is just a normal distribution where the average value is 0, and the variance (the mathematical term for the variation) is 1.
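As a rough empirical check of that definition, the sketch below draws a large sample from randn and looks at its mean and variance. The seed and sample size are arbitrary choices, made only so the run is repeatable:

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only for repeatability
samples = np.random.randn(100_000)

print(samples.mean())  # close to 0
print(samples.var())   # close to 1

# Fraction of values within one standard deviation of the mean.
# For a normal distribution this is roughly 68%.
within_one_sd = np.mean(np.abs(samples) < 1)
print(within_one_sd)
```

The values cluster around the mean, which is the behaviour described above and the opposite of what a uniform distribution would give.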

So,
```
numpy.random.normal(size=(10, 10))
```
draws from exactly the same distribution as
```
numpy.random.randn(10, 10)
```
because the default values (loc=0, scale=1) for numpy.random.normal describe the standard normal distribution.
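One way to see this for yourself is to reset the seed before each call and compare the results. This is only a sketch: the seed is arbitrary, and getting matching numbers from the legacy `numpy.random` functions is an implementation detail I'd expect to hold rather than something the documentation promises.

```python
import numpy as np

np.random.seed(42)  # arbitrary seed
a = np.random.normal(size=(10, 10))

np.random.seed(42)  # reset to the same starting state
b = np.random.randn(10, 10)

print(a.shape == b.shape)
print(np.allclose(a, b))  # compares the two arrays element by element
```

Without reseeding, the two calls would of course return different numbers, since each one advances the random state.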

To make matters more confusing, as the numpy random documentation states:
```
sigma * np.random.randn(...) + mu
```
is the same as
```
np.random.normal(loc=mu, scale=sigma, ...)
```
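A quick way to convince yourself of that equivalence is to compare the sample statistics of both forms. In the sketch below, `mu`, `sigma`, the seeds and the sample size are example values picked for illustration, not anything from the docs:

```python
import numpy as np

mu, sigma = 3.0, 2.5  # example values
n = 100_000

np.random.seed(0)
scaled = sigma * np.random.randn(n) + mu

np.random.seed(1)  # a different seed, so the two samples are independent
direct = np.random.normal(loc=mu, scale=sigma, size=n)

print(scaled.mean(), direct.mean())  # both near mu
print(scaled.std(), direct.std())    # both near sigma
```

The individual numbers differ, but both samples have a mean close to `mu` and a standard deviation close to `sigma`.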
*Final note: I used the term variance to mathematically describe variation. Some folks say standard deviation. Variance is simply the square of the standard deviation. Since the variance is 1 for the standard normal distribution, its standard deviation is also 1, so in that one case the two coincide.
孤独总比滥情好

2021-01-31 16:22

randn draws samples from the standard normal distribution (mean 0 and variance 1). normal takes more parameters for more control. So randn seems to simply be a convenience function.

走了就别回头了

2021-01-31 16:29
Following up on @Mike Williamson's explanation about variance and standard deviation: I got caught trying to work out the example provided in the NumPy documentation for randn. The example provided there:
```
>>> import numpy as np
>>> 2.5 * np.random.randn(2, 4) + 3
array([[-1.13788245,  2.54061141, -0.12769502,  7.46200906],
       [-0.4780766 ,  1.70417835,  5.43802441,  4.71764135]])
```
The point to note here is that a normal distribution is conventionally written as N(mean, variance), whereas with .randn() you multiply the standard normal output by the standard deviation (sigma) and then add the mean (mu).

Note:

sqrt(variance) = standard deviation, or sigma

E.g., for a variance of 6.25:

sqrt(6.25) = 2.5

Hence:

sigma * numpy.random.randn(2, 4) + mean
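To make the variance-to-sigma step explicit, here is a small sketch. The helper `sample_normal` is hypothetical (it is not part of NumPy); it just wraps the conversion described above, and the seed and sample size are arbitrary.

```python
import numpy as np

def sample_normal(mean, variance, *dims):
    """Hypothetical helper: draw from N(mean, variance) using randn."""
    sigma = np.sqrt(variance)  # randn must be scaled by the standard deviation
    return sigma * np.random.randn(*dims) + mean

np.random.seed(0)  # arbitrary seed, only for repeatability
draws = sample_normal(3, 6.25, 200_000)

print(draws.mean())  # near 3
print(draws.var())   # near 6.25
print(sample_normal(3, 6.25, 2, 4).shape)  # same shape as the docs example
```

Passing the variance (6.25) straight in as the multiplier instead of sigma (2.5) would give a sample variance near 39, which is the mistake this answer is warning about.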