Multivariate normal density in Python?

星月不相逢 2021-01-30 19:42

Is there any python package that allows the efficient computation of the PDF (probability density function) of a multivariate normal distribution?

It doesn't seem to be

10 answers
  • 2021-01-30 20:21

    The multivariate normal is now available on SciPy 0.14.0.dev-16fc0af:

    from scipy.stats import multivariate_normal
    var = multivariate_normal(mean=[0,0], cov=[[1,0],[0,1]])
    var.pdf([1,0])
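
    As a small sketch of the same API: the frozen object also has a logpdf method, and pdf accepts an array whose last axis indexes the components, so several points can be evaluated in one call. The sample points below are illustrative assumptions, not part of the original answer.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Standard bivariate normal, as in the answer above
var = multivariate_normal(mean=[0, 0], cov=[[1, 0], [0, 1]])

# Evaluate several points at once: one row per point (illustrative values)
points = np.array([[1.0, 0.0], [0.0, 0.0], [2.0, -1.0]])
densities = var.pdf(points)          # array of shape (3,)
log_densities = var.logpdf(points)   # logarithm of the same values

# Closed form for this special case: exp(-0.5 * ||x||^2) / (2 * pi)
expected = np.exp(-0.5 * np.sum(points**2, axis=1)) / (2 * np.pi)
```

    For the identity covariance the closed form should agree with pdf up to floating-point error, which makes it a convenient sanity check.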
    
  • 2021-01-30 20:22

    Here I elaborate a bit more on how exactly to use the multivariate_normal() from the scipy package:

    # Import packages
    import numpy as np
    from scipy.stats import multivariate_normal
    
    # Prepare your data
    x = np.linspace(-10, 10, 500)
    y = np.linspace(-10, 10, 500)
    X, Y = np.meshgrid(x,y)
    
    # Get the multivariate normal distribution
    mu_x = np.mean(x)
    sigma_x = np.std(x)
    mu_y = np.mean(y)
    sigma_y = np.std(y)
    # The covariance matrix takes variances (sigma**2) on its diagonal, not standard deviations
    rv = multivariate_normal([mu_x, mu_y], [[sigma_x**2, 0], [0, sigma_y**2]])
    
    # Get the probability density
    pos = np.empty(X.shape + (2,))
    pos[:, :, 0] = X
    pos[:, :, 1] = Y
    pd = rv.pdf(pos)
    
  • 2021-01-30 20:26

    I use the following code which calculates the logpdf value, which is preferable for larger dimensions. It also works for scipy.sparse matrices.

    import numpy as np
    import math
    import scipy.sparse as sp
    import scipy.sparse.linalg as spln
    
    def lognormpdf(x,mu,S):
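        """ Calculate the log of the Gaussian probability density of x, when x ~ N(mu, S) """
        nx = S.shape[0]  # len() is not defined for scipy.sparse matrices
        # slogdet needs a dense array, so densify a sparse S for the determinant only
        S_dense = S.toarray() if sp.issparse(S) else S
        norm_coeff = nx*math.log(2*math.pi)+np.linalg.slogdet(S_dense)[1]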
    
        err = x-mu
        if (sp.issparse(S)):
            numerator = spln.spsolve(S, err).T.dot(err)
        else:
            numerator = np.linalg.solve(S, err).T.dot(err)
    
        return -0.5*(norm_coeff+numerator)
    

    The code is from pyParticleEst. If you want the pdf value instead of the logpdf, just apply math.exp() to the returned value.
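
    To illustrate why the log density is preferable in large dimensions, here is a minimal sketch that uses scipy.stats.multivariate_normal rather than the function above. The dimension d = 1000 and the identity covariance are arbitrary assumptions, chosen so that the plain density falls below the smallest representable float64.

```python
import numpy as np
from scipy.stats import multivariate_normal

d = 1000  # assumed dimension, large enough for the density to underflow
rv = multivariate_normal(mean=np.zeros(d), cov=np.eye(d))
x = np.zeros(d)  # the mode, where the density is at its maximum

pdf_value = rv.pdf(x)        # (2*pi)**(-d/2) is too small for float64
logpdf_value = rv.logpdf(x)  # finite: -0.5 * d * log(2*pi)
```

    Even at the mode the density evaluates to zero in double precision, while the log density stays finite and usable for comparisons or sums.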

  • 2021-01-30 20:26

    You can easily compute it using numpy. I implemented it as shown below for a machine learning course and would like to share it; I hope it helps someone.

    import numpy as np
    X = np.array([[13.04681517, 14.74115241],[13.40852019, 13.7632696 ],[14.19591481, 15.85318113],[14.91470077, 16.17425987]])
    
    def est_gaus_par(X):
        mu = np.mean(X,axis=0)
        sig = np.std(X,axis=0)
        return mu,sig
    
    mu,sigma = est_gaus_par(X)
    
    def est_mult_gaus(X,mu,sigma):
        m = len(mu)
        sigma2 = np.diag(sigma**2)  # covariance matrix: variances (not standard deviations) on the diagonal
        X = X-mu.T
        p = 1/((2*np.pi)**(m/2)*np.linalg.det(sigma2)**(0.5))*np.exp(-0.5*np.sum(X.dot(np.linalg.pinv(sigma2))*X,axis=1))
    
        return p
    
    p = est_mult_gaus(X, mu, sigma)
    
  • 2021-01-30 20:28

    In the common case of a diagonal covariance matrix, the multivariate PDF can be obtained by simply multiplying the univariate PDF values returned by a scipy.stats.norm instance. If you need the general case, you will probably have to code this yourself (which shouldn't be hard).
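
    A minimal sketch of the diagonal case described above, using scipy.stats.norm. The mean, standard deviations and evaluation point are made-up illustrative values, and the cross-check assumes a SciPy version that ships multivariate_normal.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

mu = np.array([1.0, -2.0, 0.5])    # illustrative mean vector
sigma = np.array([0.5, 2.0, 1.5])  # illustrative per-dimension standard deviations
x = np.array([0.8, -1.0, 0.0])     # point at which to evaluate the density

# Diagonal covariance: the joint density is the product of the univariate densities
pdf_product = np.prod(norm.pdf(x, loc=mu, scale=sigma))

# Summing univariate log densities is the safer variant in high dimensions
logpdf_sum = np.sum(norm.logpdf(x, loc=mu, scale=sigma))

# Cross-check against the general implementation with cov = diag(sigma**2)
pdf_reference = multivariate_normal(mean=mu, cov=np.diag(sigma**2)).pdf(x)
```

    Note that norm takes standard deviations (scale), whereas the covariance matrix holds variances, hence sigma**2 in the cross-check.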

  • 2021-01-30 20:29

    The density can be computed in a fairly straightforward way using numpy functions and the formula on this page: http://en.wikipedia.org/wiki/Multivariate_normal_distribution. You may also want to use the log-density (the log-likelihood), which is less likely to underflow in high dimensions and is slightly simpler to compute. Both only require computing the determinant and the inverse of the covariance matrix.
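
    For reference, these are the two expressions in question, written for a point x in k dimensions with mean vector \mu and a positive-definite covariance matrix \Sigma (the symbol names are my own notation, not taken from the answer):

```latex
p(x) = (2\pi)^{-k/2} \, |\Sigma|^{-1/2}
       \exp\!\left( -\tfrac{1}{2} (x - \mu)^{\mathsf{T}} \Sigma^{-1} (x - \mu) \right)

\log p(x) = -\tfrac{1}{2} \left[ k \log(2\pi) + \log|\Sigma|
            + (x - \mu)^{\mathsf{T}} \Sigma^{-1} (x - \mu) \right]
```

    In practice the quadratic form is usually evaluated by solving a linear system with \Sigma rather than forming the inverse explicitly, and \log|\Sigma| by a log-determinant routine rather than by taking the logarithm of the determinant.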

    The CDF, on the other hand, is an entirely different animal...
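
    For what it is worth, recent SciPy releases (1.0 and later, as far as I know) provide a cdf method on multivariate_normal that evaluates the CDF by numerical integration. A minimal sketch, assuming such a version is installed:

```python
import numpy as np
from scipy.stats import multivariate_normal

# Two independent standard normal components
rv = multivariate_normal(mean=[0, 0], cov=[[1, 0], [0, 1]])

# P(X1 <= 0, X2 <= 0); the result is numerical, so only approximate
p_quadrant = rv.cdf([0, 0])

# With independent components the exact value is 0.5 * 0.5 = 0.25
exact = 0.25
```

    The result should be compared with a tolerance rather than for exact equality, since it comes from an integration routine.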
