I have some code that samples a posterior distribution using MCMC, specifically Metropolis-Hastings. I use scipy to generate the random samples:

    import numpy as np
    from scipy import stats
Unfortunately, I don't see any way to speed up the random-distribution calls themselves, short of rewriting them in numba-compatible Python code. But one easy way to speed up the bottleneck of your code is to replace the two calls to the stats functions with a single vectorized call:
    p1, p2 = (
        stats.beta(a=2, b=5).pdf([x_prime, x_t])
        * stats.norm(loc=0, scale=2).pdf([x_prime, x_t])
    )
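As a quick sanity check that the combined call computes the same two values as the separate calls (a sketch with made-up placeholder values for x_prime and x_t):

```python
import numpy as np
from scipy import stats

x_prime, x_t = 0.3, 0.5  # placeholder values for proposed and current state

# one vectorized call evaluates both unnormalised posterior values at once
p1, p2 = (
    stats.beta(a=2, b=5).pdf([x_prime, x_t])
    * stats.norm(loc=0, scale=2).pdf([x_prime, x_t])
)

# the same values from two separate scalar calls
q1 = stats.beta(a=2, b=5).pdf(x_prime) * stats.norm(loc=0, scale=2).pdf(x_prime)
q2 = stats.beta(a=2, b=5).pdf(x_t) * stats.norm(loc=0, scale=2).pdf(x_t)
assert np.isclose(p1, q1) and np.isclose(p2, q2)
```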
Another slight tweak is to move the generation of u outside of the for loop:
x_t = stats.uniform(0, 1).rvs() # initial value
posterior = np.zeros((n,))
u = stats.uniform(0, 1).rvs(size=n) # random uniform
for t in range(n): # and so on
Then index u within the loop (of course, the line u = stats.uniform(0, 1).rvs() inside the loop has to be deleted):
    if u[t] <= alpha:
        x_t = x_prime  # accept
        posterior[t] = x_t
    elif u[t] > alpha:
        x_t = x_t  # reject
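To see how much this tweak alone buys, the two ways of drawing u can be timed in isolation (a sketch; absolute numbers will vary by machine):

```python
import timeit
from scipy import stats

n = 1000

def draw_in_loop():
    # one scipy call per iteration, as in the original loop
    for _ in range(n):
        u = stats.uniform(0, 1).rvs()

def draw_batched():
    # a single vectorized call producing all n draws up front
    u = stats.uniform(0, 1).rvs(size=n)

t_loop = timeit.timeit(draw_in_loop, number=10)
t_batch = timeit.timeit(draw_batched, number=10)
# the batched version avoids n - 1 Python-level scipy calls per run
```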
A minor further change is to simplify the if condition by omitting the elif statement (or, if it is required for other purposes, replacing it with else). But this is really just a tiny improvement:
    if u[t] <= alpha:
        x_t = x_prime  # accept
        posterior[t] = x_t
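To make it concrete that the elif branch is a no-op, here is a minimal check (with made-up values for u[t], alpha, x_t and x_prime):

```python
def step_with_elif(u_t, alpha, x_t, x_prime):
    if u_t <= alpha:
        x_t = x_prime  # accept
    elif u_t > alpha:
        x_t = x_t  # reject: reassigns x_t to itself, i.e. does nothing
    return x_t

def step_without_elif(u_t, alpha, x_t, x_prime):
    if u_t <= alpha:
        x_t = x_prime  # accept
    return x_t

# both an accepted and a rejected step give identical results
assert step_with_elif(0.2, 0.5, 1.0, 2.0) == step_without_elif(0.2, 0.5, 1.0, 2.0)
assert step_with_elif(0.9, 0.5, 1.0, 2.0) == step_without_elif(0.9, 0.5, 1.0, 2.0)
```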
Another improvement based on jwalton's answer:
    import numpy as np
    from scipy.stats import beta, norm

    def new_get_samples(n):
        """
        Generate and return a randomly sampled posterior.

        For simplicity, the prior is fixed as Beta(a=2, b=5) and the
        likelihood is fixed as Normal(0, 2).

        :type n: int
        :param n: number of iterations
        :rtype: numpy.ndarray
        """
        x_cur = np.random.uniform()        # initial value
        innov = norm.rvs(size=n)           # all proposal innovations at once
        x_prop = x_cur + innov             # all proposal values at once
        u = np.random.uniform(size=n)      # all acceptance draws at once
        post_cur = beta.pdf(x_cur, a=2, b=5) * norm.pdf(x_cur, loc=0, scale=2)
        post_prop = beta.pdf(x_prop, a=2, b=5) * norm.pdf(x_prop, loc=0, scale=2)
        posterior = np.zeros((n,))
        for t in range(n):
            alpha = post_prop[t] / post_cur
            if u[t] <= alpha:
                x_cur = x_prop[t]          # accept
                post_cur = post_prop[t]
            posterior[t] = x_cur
        return posterior
With the improved timings of:

    %timeit my_get_samples(1000)
    # 187 ms ± 13 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
    %timeit new_get_samples(1000)
    # 1.55 ms ± 57.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
This is a speed-up by a factor of about 121 over jwalton's answer. It is accomplished by moving the post_prop calculation out of the loop and computing all proposals at once.
Checking the histogram, this seems to be OK. But better ask jwalton if it really is, since he seems to have a much deeper understanding of the topic. :)
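One way to do that histogram check programmatically (a sketch; it restates new_get_samples from above so the snippet is self-contained, and normalises the target density numerically on a grid over the Beta support):

```python
import numpy as np
from scipy.stats import beta, norm

def new_get_samples(n):
    # same sampler as above
    x_cur = np.random.uniform()
    innov = norm.rvs(size=n)
    x_prop = x_cur + innov
    u = np.random.uniform(size=n)
    post_cur = beta.pdf(x_cur, a=2, b=5) * norm.pdf(x_cur, loc=0, scale=2)
    post_prop = beta.pdf(x_prop, a=2, b=5) * norm.pdf(x_prop, loc=0, scale=2)
    posterior = np.zeros((n,))
    for t in range(n):
        alpha = post_prop[t] / post_cur
        if u[t] <= alpha:
            x_cur = x_prop[t]
            post_cur = post_prop[t]
        posterior[t] = x_cur
    return posterior

samples = new_get_samples(50_000)

# unnormalised target: Beta(2, 5) prior times Normal(0, 2) likelihood,
# normalised numerically on a grid over the Beta support [0, 1]
grid = np.linspace(0.0, 1.0, 1001)
dens = beta.pdf(grid, a=2, b=5) * norm.pdf(grid, loc=0, scale=2)
dens /= dens.sum() * (grid[1] - grid[0])

# normalised histogram of the samples, ready to compare against dens,
# e.g. with matplotlib:
#   plt.hist(samples, bins=50, density=True); plt.plot(grid, dens)
hist, edges = np.histogram(samples, bins=50, range=(0.0, 1.0), density=True)
```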