How to compute scipy sparse matrix determinant without turning it to dense?

前端 未结 3 2032
挽巷
挽巷 2021-02-01 17:19

I am trying to figure out the fastest method to find the determinant of sparse symmetric and real matrices in python. using scipy sparse module but really surprised

相关标签:
3条回答
  • 2021-02-01 18:07

    The "standard" way to solve this problem is with a cholesky decomposition, but if you're not up to using any new compiled code, then you're out of luck. The best sparse cholesky implementation is Tim Davis's CHOLMOD, which is licensed under the LGPL and thus not available in scipy proper (scipy is BSD).

    0 讨论(0)
  • 2021-02-01 18:24

    Here are some references I provided as part of an answer here. I think they address the actual problem you are trying to solve:

    • notes for an implementation in the Shogun library
    • Erlend Aune, Daniel P. Simpson: Parameter estimation in high dimensional Gaussian distributions, particularly section 2.1 (arxiv:1105.5256)
    • Ilse C.F. Ipsen, Dean J. Lee: Determinant Approximations (arxiv:1105.0437)
    • Arnold Reusken: Approximation of the determinant of large sparse symmetric positive definite matrices (arxiv:hep-lat/0008007)

    Quoting from the Shogun notes:

    The usual technique for computing the log-determinant term in the likelihood expression relies on Cholesky factorization of the matrix, i.e. Σ=LLT, (L is the lower triangular Cholesky factor) and then using the diagonal entries of the factor to compute log(det(Σ))=2∑ni=1log(Lii). However, for sparse matrices, as covariance matrices usually are, the Cholesky factors often suffer from fill-in phenomena - they turn out to be not so sparse themselves. Therefore, for large dimensions this technique becomes infeasible because of a massive memory requirement for storing all these irrelevant non-diagonal co-efficients of the factor. While ordering techniques have been developed to permute the rows and columns beforehand in order to reduce fill-in, e.g. approximate minimum degree (AMD) reordering, these techniques depend largely on the sparsity pattern and therefore not guaranteed to give better result.

    Recent research shows that using a number of techniques from complex analysis, numerical linear algebra and greedy graph coloring, we can, however, approximate the log-determinant up to an arbitrary precision [Aune et. al., 2012]. The main trick lies within the observation that we can write log(det(Σ)) as trace(log(Σ)), where log(Σ) is the matrix-logarithm.

    0 讨论(0)
  • 2021-02-01 18:24

    You can use scipy.sparse.linalg.splu to obtain sparse matrices for the lower (L) and upper (U) triangular matrices of an M=LU decomposition:

    from scipy.sparse.linalg import splu
    
    lu = splu(M)
    

    The determinant det(M) can be then represented as:

    det(M) = det(LU) = det(L)det(U)
    

    The determinant of triangular matrices is just the product of the diagonal terms:

    diagL = lu.L.diagonal()
    diagU = lu.U.diagonal()
    d = diagL.prod()*diagU.prod()
    

    However, for large matrices underflow or overflow commonly occurs, which can be avoided by working with the logarithms.

    diagL = diagL.astype(np.complex128)
    diagU = diagU.astype(np.complex128)
    logdet = np.log(diagL).sum() + np.log(diagU).sum()
    

    Note that I invoke complex arithmetic to account for negative numbers that might appear in the diagonals. Now, from logdet you can recover the determinant:

    det = np.exp(logdet) # usually underflows/overflows for large matrices
    

    whereas the sign of the determinant can be calculated directly from diagL and diagU (important for example when implementing Crisfield's arc-length method):

    sign = swap_sign*np.sign(diagL).prod()*np.sign(diagU).prod()
    

    where swap_sign is a term to consider the number of permutations in the LU decomposition. Thanks to @Luiz Felippe Rodrigues, it can be calculated:

    swap_sign = minimumSwaps(lu.perm_r)
    
    def minimumSwaps(arr): 
        """
        Minimum number of swaps needed to order a
        permutation array
        """
        # from https://www.thepoorcoder.com/hackerrank-minimum-swaps-2-solution/
        a = dict(enumerate(arr))
        b = {v:k for k,v in a.items()}
        count = 0
        for i in a:
            x = a[i]
            if x!=i:
                y = b[i]
                a[y] = x
                b[x] = y
                count+=1
        return count
    
    0 讨论(0)
提交回复
热议问题