Speed up Metropolis--Hastings in Python

半阙折子戏 2021-02-10 01:07

I have some code that samples a posterior distribution using MCMC, specifically Metropolis-Hastings. I use scipy to generate random samples:

import numpy as np
from scipy import stats
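
The rest of the code block did not survive in this copy; the sketch below is reconstructed from the snippets quoted in the answers (the function name get_samples and the proposal distribution and scale are assumptions; everything else follows those snippets):

def get_samples(n):
    x_t = stats.uniform(0, 1).rvs()  # initial value
    posterior = np.zeros((n,))
    for t in range(n):
        # proposal step: distribution and scale are assumptions
        x_prime = x_t + stats.norm(loc=0, scale=1).rvs()
        # unnormalised posterior = prior * likelihood, at proposal and current point
        p1 = stats.beta(a=2, b=5).pdf(x_prime) * stats.norm(loc=0, scale=2).pdf(x_prime)
        p2 = stats.beta(a=2, b=5).pdf(x_t) * stats.norm(loc=0, scale=2).pdf(x_t)
        alpha = p1 / p2  # acceptance ratio
        u = stats.uniform(0, 1).rvs()  # random uniform
        if u <= alpha:
            x_t = x_prime  # accept
            posterior[t] = x_t
        elif u > alpha:
            x_t = x_t  # reject
    return posterior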


        
2 Answers
  •  囚心锁ツ
    2021-02-10 01:33

    Unfortunately, I don't see any way to speed up the random-distribution calls themselves, short of rewriting them yourself in numba-compatible Python code.
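
    To illustrate, such a rewrite could look roughly like the sketch below. The densities are hand-coded because scipy.stats does not work inside numba's nopython mode, and the names (numba_get_samples, the pdf helpers) are made up for illustration:

    import math
    import numpy as np
    from numba import njit

    @njit
    def beta25_pdf(x):
        # Beta(a=2, b=5) density, hand-coded: 30 * x * (1 - x)**4 on [0, 1]
        if x < 0.0 or x > 1.0:
            return 0.0
        return 30.0 * x * (1.0 - x) ** 4

    @njit
    def norm02_pdf(x):
        # Normal(loc=0, scale=2) density
        return math.exp(-x * x / 8.0) / (2.0 * math.sqrt(2.0 * math.pi))

    @njit
    def numba_get_samples(n):
        x_cur = np.random.uniform(0.0, 1.0)
        post_cur = beta25_pdf(x_cur) * norm02_pdf(x_cur)
        posterior = np.zeros(n)
        for t in range(n):
            x_prop = x_cur + np.random.normal(0.0, 1.0)
            post_prop = beta25_pdf(x_prop) * norm02_pdf(x_prop)
            if np.random.uniform(0.0, 1.0) <= post_prop / post_cur:
                x_cur = x_prop  # accept
                post_cur = post_prop
            posterior[t] = x_cur  # rejected steps repeat the current value
        return posterior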

    But one easy way to speed up the bottleneck of your code is to replace the two calls to the stats functions with a single vectorized call:

    # evaluate prior * likelihood at both points in one vectorized call
    p1, p2 = (
        stats.beta(a=2, b=5).pdf([x_prime, x_t])
        * stats.norm(loc=0, scale=2).pdf([x_prime, x_t]))
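
    Along the same lines, a further micro-optimization worth trying (a sketch, not benchmarked here) is to freeze the two distributions once outside the loop, so SciPy does not rebuild a frozen-distribution object on every iteration:

    beta_prior = stats.beta(a=2, b=5)      # frozen distribution, built once
    norm_lik = stats.norm(loc=0, scale=2)  # frozen distribution, built once

    # inside the loop only the pdf evaluations remain
    p1, p2 = beta_prior.pdf([x_prime, x_t]) * norm_lik.pdf([x_prime, x_t])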
    

    Another slight tweak is to move the generation of u outside of the for loop:

    x_t = stats.uniform(0, 1).rvs() # initial value
    posterior = np.zeros((n,))
    u = stats.uniform(0, 1).rvs(size=n) # random uniform
    for t in range(n):  # and so on
    

    Then index u within the loop (of course, the line u = stats.uniform(0,1).rvs() # random uniform inside the loop has to be deleted):

    if u[t] <= alpha:
        x_t = x_prime # accept
        posterior[t] = x_t
    elif u[t] > alpha:
        x_t = x_t # reject
    

    A minor change is to simplify the if condition by omitting the elif branch (or, if it is needed for other purposes, replacing it with a plain else). This is really just a tiny improvement:

    if u[t] <= alpha:
        x_t = x_prime # accept
        posterior[t] = x_t
    

    Edit

    Another improvement based on jwalton's answer:

    from scipy.stats import beta, norm

    def new_get_samples(n):
        """
        Generate and return a randomly sampled posterior.

        For simplicity, the prior is fixed as Beta(a=2, b=5) and the
        likelihood as Normal(0, 2).

        :type n: int
        :param n: number of iterations

        :rtype: numpy.ndarray
        """
        x_cur = np.random.uniform()
        # draw all innovations and uniforms up front; note that every
        # proposal is an offset from the initial x_cur, not from the
        # running state of the chain
        innov = norm.rvs(size=n)
        x_prop = x_cur + innov
        u = np.random.uniform(size=n)

        # evaluate the unnormalised posterior for all proposals at once
        post_cur = beta.pdf(x_cur, a=2, b=5) * norm.pdf(x_cur, loc=0, scale=2)
        post_prop = beta.pdf(x_prop, a=2, b=5) * norm.pdf(x_prop, loc=0, scale=2)

        posterior = np.zeros((n,))
        for t in range(n):
            alpha = post_prop[t] / post_cur
            if u[t] <= alpha:
                x_cur = x_prop[t]
                post_cur = post_prop[t]
            posterior[t] = x_cur  # rejected steps repeat the current value
        return posterior
    

    With the improved timings of:

    %timeit my_get_samples(1000)   # jwalton's version
    # 187 ms ± 13 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
    %timeit new_get_samples(1000)
    # 1.55 ms ± 57.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    

    This is a speed-up by a factor of 121 over jwalton's answer, accomplished by computing post_prop for all proposals in a single vectorized call outside the loop. Note also that posterior[t] = x_cur is set on every iteration, so rejected steps repeat the current value instead of leaving a zero behind.

    Checking the histogram, this seems to be OK. Be aware, though, that because x_prop is computed up front, every proposal is an offset from the initial value rather than from the current state of the chain, so this is no longer a standard random-walk sampler; better to ask jwalton if it really is OK, as he seems to have a much better understanding of the topic. :)
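
    For the histogram check itself, a minimal sketch (assuming matplotlib is installed and jwalton's my_get_samples is in scope):

    import matplotlib.pyplot as plt

    plt.hist(my_get_samples(100000), bins=50, density=True, alpha=0.5, label="my_get_samples")
    plt.hist(new_get_samples(100000), bins=50, density=True, alpha=0.5, label="new_get_samples")
    plt.legend()
    plt.show()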
