I am trying to generate random points on the surface of the sphere using numpy. I have reviewed the post that explains uniform distribution here. However, need ideas on how
Points on the surface of a sphere can be expressed using two spherical coordinates, theta and phi, with 0 < theta < 2pi and 0 < phi < pi.
Conversion formula into cartesian x, y, z coordinates:
x = r * cos(theta) * sin(phi)
y = r * sin(theta) * sin(phi)
z = r * cos(phi)
where r is the radius of the sphere.
So the program could randomly sample theta and phi in their ranges, at uniform distribution, and generate the cartesian coordinates from it.
But then the points get distributed more densely near the poles of the sphere. In order for points to get uniformly distributed on the sphere surface, phi needs to be chosen as phi = acos(a), where -1 < a < 1 is chosen from a uniform distribution.
For the NumPy code it would be the same as in Sampling uniformly distributed random points inside a spherical volume, except that the variable radius has a fixed value.
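The clustering near the poles can be checked numerically. The sketch below is only an illustration: the seeded generator, the sample size and the helper name polar_fraction are my own assumptions. It measures the fraction of points with |z| > 0.9 on the unit sphere. Those two polar caps cover 10% of the surface area, so a uniform sampler should put roughly 10% of its points there.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the check is repeatable
n = 200_000

# Naive choice: phi uniform on [0, pi]
phi_naive = rng.uniform(0.0, np.pi, n)
# Corrected choice: phi = acos(a) with a uniform on [-1, 1]
phi_fixed = np.arccos(rng.uniform(-1.0, 1.0, n))

def polar_fraction(phi, z_min=0.9):
    """Fraction of points whose |z| = |cos(phi)| exceeds z_min (hypothetical helper)."""
    return float(np.mean(np.abs(np.cos(phi)) > z_min))

frac_naive = polar_fraction(phi_naive)
frac_fixed = polar_fraction(phi_fixed)
print(f"naive phi:     {frac_naive:.3f} of points in the polar caps")
print(f"phi = acos(a): {frac_fixed:.3f} of points in the polar caps")
```

With the naive choice the fraction should come out near 2*acos(0.9)/pi, about 0.29, which is almost three times the share the caps deserve.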
Based on the last approach on this page, you can simply generate a vector consisting of independent samples from three standard normal distributions, then normalize the vector such that its magnitude is 1:
import numpy as np
def sample_spherical(npoints, ndim=3):
    vec = np.random.randn(ndim, npoints)
    vec /= np.linalg.norm(vec, axis=0)
    return vec
For example:
from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d import axes3d
phi = np.linspace(0, np.pi, 20)
theta = np.linspace(0, 2 * np.pi, 40)
x = np.outer(np.sin(theta), np.cos(phi))
y = np.outer(np.sin(theta), np.sin(phi))
z = np.outer(np.cos(theta), np.ones_like(phi))
xi, yi, zi = sample_spherical(100)
fig, ax = plt.subplots(1, 1, subplot_kw={'projection':'3d', 'aspect':'equal'})
ax.plot_wireframe(x, y, z, color='k', rstride=1, cstride=1)
ax.scatter(xi, yi, zi, s=100, c='r', zorder=10)
The same method also generalizes to picking uniformly distributed points on the unit circle (ndim=2) or on the surfaces of higher-dimensional unit hyperspheres.
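As a quick sanity check of the generalization to other dimensions, here is a minimal sketch of the same idea written against NumPy's Generator API. The function name sample_spherical_rng and the optional rng argument are assumptions of mine, not part of the answer above.

```python
import numpy as np

def sample_spherical_rng(npoints, ndim=3, rng=None):
    """Same technique as sample_spherical, but with an explicit Generator."""
    rng = np.random.default_rng() if rng is None else rng
    vec = rng.standard_normal((ndim, npoints))  # independent N(0, 1) samples
    vec /= np.linalg.norm(vec, axis=0)          # project each column onto the unit sphere
    return vec

rng = np.random.default_rng(42)
circle = sample_spherical_rng(1000, ndim=2, rng=rng)   # points on the unit circle
sphere4 = sample_spherical_rng(1000, ndim=4, rng=rng)  # points on the unit 3-sphere in 4D

# Every column should have unit length, whatever the dimension
print(np.allclose(np.linalg.norm(circle, axis=0), 1.0))
print(np.allclose(np.linalg.norm(sphere4, axis=0), 1.0))
```

Passing the generator in explicitly makes runs reproducible without touching NumPy's global random state.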
Another way, which depending on the hardware could be much faster:
Choose a, b, c to be three uniformly distributed random numbers, each between -1 and 1
Calculate r2 = a^2 + b^2 + c^2
If r2 > 1.0 (=the point isn't in the sphere) or r2 < 0.00001 (=the point is too close to the center, we'll have division by zero while projecting to the surface of the sphere) you discard the values, and pick another set of random a, b, c
Otherwise, you’ve got your random point (relative to center of the sphere):
ir = R / sqrt(r2)
x = a * ir
y = b * ir
z = c * ir
(edited to reflect corrections from comments)
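The steps above could be sketched in NumPy roughly as follows. This is one possible batched variant, not necessarily what the answer had in mind: the function name sample_reject_batched, the oversampling factor and the rng argument are all assumptions. About pi/6 (roughly 52%) of the candidates fall inside the unit ball, so each pass draws about twice as many candidates as are still needed.

```python
import numpy as np

def sample_reject_batched(npoints, radius=1.0, rng=None):
    """Rejection sampling of points on a sphere of the given radius (sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    out = np.empty((3, npoints))
    filled = 0
    while filled < npoints:
        need = npoints - filled
        # Candidates a, b, c, uniform in the cube [-1, 1]^3
        abc = rng.uniform(-1.0, 1.0, size=(3, 2 * need + 8))
        r2 = np.sum(abc**2, axis=0)
        # Discard points outside the unit ball or too close to the center
        keep = (r2 <= 1.0) & (r2 >= 1e-5)
        good = abc[:, keep] * (radius / np.sqrt(r2[keep]))  # ir = R / sqrt(r2)
        take = min(need, good.shape[1])
        out[:, filled:filled + take] = good[:, :take]
        filled += take
    return out

pts = sample_reject_batched(1000, radius=2.0, rng=np.random.default_rng(3))
print(pts.shape, np.allclose(np.linalg.norm(pts, axis=0), 2.0))
```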
i investigated a few constant time approaches to this problem in 2004.
assuming you're working in spherical coordinates where theta is the angle around the vertical axis (eg longitude) and phi is the angle raised up from the equator (eg latitude), then to obtain a uniform distribution of random points on the hemisphere north of the equator you do this:
theta = rand(0, 360)
phi = 90 * (1 - sqrt(rand(0, 1)))
to get points on a sphere instead of a hemisphere, simply negate phi 50% of the time.
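Since no proof is given, one way to test the hemisphere recipe empirically is to look at the height z = sin(phi) of the sampled points. On an exactly uniform hemisphere z is itself uniform on [0, 1], so its mean should be 0.5. The sketch below is only a rough check; the seed and sample size are arbitrary assumptions, and theta is left out because it does not affect z.

```python
import numpy as np

rng = np.random.default_rng(1)  # seeded only so the check is repeatable
n = 500_000

# Latitude in degrees, following the recipe phi = 90 * (1 - sqrt(rand(0, 1)))
phi_deg = 90.0 * (1.0 - np.sqrt(rng.uniform(0.0, 1.0, n)))

# Height above the equatorial plane on the unit sphere
z = np.sin(np.radians(phi_deg))
mean_z = float(z.mean())
print(f"mean z = {mean_z:.4f} (an exactly uniform hemisphere would give 0.5)")
```

Working the integral by hand gives 4/pi - 8/pi^2, about 0.463, for this recipe. That suggests it is close to uniform but slightly sparse near the pole, so it may be worth comparing against phi = asin(rand(0, 1)) (in radians) if exact uniformity matters.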
for the curious, a similar approach holds for generating uniformly-distributed points on a unit-disk:
theta = rand(0, 360)
radius = sqrt(rand(0, 1))
i do not have proofs for the correctness of these approaches, but i've used them with lots of success over the past decade or so, and am convinced of their correctness.
some illustration (from 2004) of the various approaches is here, including a visualization of the approach of choosing points on the surface of a cube and normalizing them onto the sphere.
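The unit-disk recipe above translates to NumPy almost directly. A minimal sketch follows; the function name sample_disk and the rng argument are my own assumptions, and the angle is drawn in radians rather than degrees. The check uses the fact that the inner disk of radius 0.5 holds a quarter of the total area, so about 25% of uniformly distributed points should land in it.

```python
import numpy as np

def sample_disk(npoints, rng=None):
    """Points on the unit disk via theta uniform and radius = sqrt(uniform)."""
    rng = np.random.default_rng() if rng is None else rng
    theta = rng.uniform(0.0, 2.0 * np.pi, npoints)     # angle in radians
    radius = np.sqrt(rng.uniform(0.0, 1.0, npoints))   # sqrt compensates for area growing with r
    return np.array([radius * np.cos(theta), radius * np.sin(theta)])

disk_pts = sample_disk(200_000, rng=np.random.default_rng(2))
inner_fraction = float(np.mean(np.hypot(disk_pts[0], disk_pts[1]) < 0.5))
print(f"fraction inside radius 0.5: {inner_fraction:.3f}")
```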
Following some discussion with @Soonts I got curious about the performance of the three approaches used in the answers: one with generating random angles, one using normally distributed coordinates, and one rejecting uniformly distributed points.
Here's my attempted comparison:
import numpy as np
def sample_trig(npoints):
    theta = 2*np.pi*np.random.rand(npoints)
    phi = np.arccos(2*np.random.rand(npoints)-1)
    x = np.cos(theta) * np.sin(phi)
    y = np.sin(theta) * np.sin(phi)
    z = np.cos(phi)
    return np.array([x,y,z])
def sample_normals(npoints):
    vec = np.random.randn(3, npoints)
    vec /= np.linalg.norm(vec, axis=0)
    return vec
def sample_reject(npoints):
    vec = np.zeros((3,npoints))
    abc = 2*np.random.rand(3,npoints)-1
    norms = np.linalg.norm(abc,axis=0)
    mymask = norms<=1
    abc = abc[:,mymask]/norms[mymask]
    k = abc.shape[1]
    vec[:,0:k] = abc
    while k<npoints:
        abc = 2*np.random.rand(3)-1
        norm = np.linalg.norm(abc)
        if 1e-5 <= norm <= 1:
            vec[:,k] = abc/norm
            k = k+1
    return vec
Then for 1000 points:
In [449]: timeit sample_trig(1000)
1000 loops, best of 3: 236 µs per loop
In [450]: timeit sample_normals(1000)
10000 loops, best of 3: 172 µs per loop
In [451]: timeit sample_reject(1000)
100 loops, best of 3: 13.7 ms per loop
Note that in the rejection-based implementation I first generated npoints samples and threw away the bad ones, and I only used a loop to generate the rest of the points. Direct step-by-step rejection seemed to take longer. I also removed the check for division-by-zero to have a cleaner comparison with the sample_normals case.
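The timings above come from IPython's timeit magic. For anyone who wants to rerun the comparison from a plain script, a rough equivalent with the standard-library timeit module might look like the sketch below. The helper best_time and its defaults are assumptions of mine, and the numbers will of course depend on the machine and the NumPy version.

```python
import timeit
import numpy as np

def sample_trig(npoints):
    theta = 2 * np.pi * np.random.rand(npoints)
    phi = np.arccos(2 * np.random.rand(npoints) - 1)
    return np.array([np.cos(theta) * np.sin(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(phi)])

def sample_normals(npoints):
    vec = np.random.randn(3, npoints)
    vec /= np.linalg.norm(vec, axis=0)
    return vec

def best_time(func, npoints=1000, number=200, repeat=5):
    """Best per-call time in seconds over `repeat` runs of `number` calls each."""
    times = timeit.repeat(lambda: func(npoints), number=number, repeat=repeat)
    return min(times) / number

for func in (sample_trig, sample_normals):
    print(f"{func.__name__}: {best_time(func) * 1e6:.1f} µs per call")
```

Taking the minimum over several repeats mirrors the "best of 3" figure that IPython reports.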
Removing vectorization from the two direct methods puts them into the same ballpark:
def sample_trig_loop(npoints):
    x = np.zeros(npoints)
    y = np.zeros(npoints)
    z = np.zeros(npoints)
    for k in range(npoints):
        theta = 2*np.pi*np.random.rand()
        phi = np.arccos(2*np.random.rand()-1)
        x[k] = np.cos(theta) * np.sin(phi)
        y[k] = np.sin(theta) * np.sin(phi)
        z[k] = np.cos(phi)
    return np.array([x,y,z])
def sample_normals_loop(npoints):
    vec = np.zeros((3,npoints))
    for k in range(npoints):
        tvec = np.random.randn(3)
        vec[:,k] = tvec/np.linalg.norm(tvec)
    return vec
In [464]: timeit sample_trig(1000)
1000 loops, best of 3: 236 µs per loop
In [465]: timeit sample_normals(1000)
10000 loops, best of 3: 173 µs per loop
In [466]: timeit sample_reject(1000)
100 loops, best of 3: 14 ms per loop
In [467]: timeit sample_trig_loop(1000)
100 loops, best of 3: 7.92 ms per loop
In [468]: timeit sample_normals_loop(1000)
100 loops, best of 3: 10.9 ms per loop