Given a posterior p(Θ|D) over some parameters Θ, one can define the following:
The Highest Posterior Density Region
Another option (adapted from R to Python and taken from the book Doing Bayesian Data Analysis by John K. Kruschke) is the following:
from scipy.optimize import fmin
from scipy.stats import *

def HDIofICDF(dist_name, credMass=0.95, **args):
    # freeze distribution with given arguments
    distri = dist_name(**args)
    # initial guess for HDIlowTailPr
    incredMass = 1.0 - credMass

    def intervalWidth(lowTailPr):
        return distri.ppf(credMass + lowTailPr) - distri.ppf(lowTailPr)

    # find lowTailPr that minimizes intervalWidth
    HDIlowTailPr = fmin(intervalWidth, incredMass, ftol=1e-8, disp=False)[0]
    # return interval as array([low, high])
    return distri.ppf([HDIlowTailPr, credMass + HDIlowTailPr])
The idea is to create a function intervalWidth that returns the width of the interval that starts at the lower-tail probability lowTailPr and contains credMass of the probability mass. The minimum of intervalWidth is then found with the fmin minimizer from scipy.
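To see why minimizing the width gives the highest density interval, here is a short derivation. The notation is assumed for illustration and is not from the book: F is the CDF, f the density, c stands for credMass and p for lowTailPr.

```latex
w(p) = F^{-1}(p + c) - F^{-1}(p), \qquad 0 \le p \le 1 - c

\frac{dw}{dp} = \frac{1}{f\left(F^{-1}(p + c)\right)} - \frac{1}{f\left(F^{-1}(p)\right)} = 0
\;\Longrightarrow\;
f\left(F^{-1}(p + c)\right) = f\left(F^{-1}(p)\right)
```

So at an interior minimum the density is the same at both endpoints, which for a continuous unimodal distribution is the defining property of the highest density interval.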
For example, the result of:
print(HDIofICDF(norm, credMass=0.95, loc=0, scale=1))
is
[-1.95996398 1.95996398]
The names of the distribution parameters passed to HDIofICDF must be exactly the ones scipy uses for that distribution (for example loc and scale for norm).
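As a rough sketch of the same idea on a skewed distribution, the variant below swaps fmin for scipy's bounded scalar minimizer, so the search stays inside the valid range of lower-tail probabilities (0, 1 - credMass). The name hdi_bounded and the xatol setting are my own choices for illustration, not part of Kruschke's code.

```python
from scipy.optimize import minimize_scalar
from scipy.stats import gamma, norm


def hdi_bounded(dist, cred_mass=0.95, **params):
    """Hypothetical variant of HDIofICDF using a bounded scalar minimizer."""
    # Freeze the distribution; keyword names must match scipy's.
    frozen = dist(**params)
    incred_mass = 1.0 - cred_mass

    def interval_width(low_tail_pr):
        return frozen.ppf(cred_mass + low_tail_pr) - frozen.ppf(low_tail_pr)

    # Search only over valid lower-tail probabilities in (0, 1 - cred_mass).
    result = minimize_scalar(
        interval_width,
        bounds=(0.0, incred_mass),
        method="bounded",
        options={"xatol": 1e-10},
    )
    low_tail_pr = result.x
    return frozen.ppf([low_tail_pr, cred_mass + low_tail_pr])


# Symmetric case: should agree closely with the equal-tailed interval.
print(hdi_bounded(norm, cred_mass=0.95, loc=0, scale=1))

# Skewed case: scipy names the gamma shape parameter `a`.
low, high = hdi_bounded(gamma, cred_mass=0.95, a=2.0, scale=1.0)
equal_tailed = gamma(a=2.0, scale=1.0).ppf([0.025, 0.975])
print(high - low, equal_tailed[1] - equal_tailed[0])
```

For the gamma case the HDI should come out no wider than the equal-tailed interval, since it is by construction the narrowest interval with the requested mass.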