How can I obtain the same 'special' solutions to underdetermined linear systems that Matlab's `A \ b` (mldivide) operator returns using numpy/scipy?

后端 未结 2 2089
悲哀的现实
悲哀的现实 2020-12-09 22:50

I found a link where it is shown with an example that the Matlab mldivide operator (\\) gives \'special\' solutions when the system of linear equat

相关标签:
2条回答
  • 2020-12-09 23:05

    Non-negative least squares (scipy.optimize.nnls) is not a general solution to this problem. A trivial case where it will fail is if all of the possible solutions contain negative coefficients:

    import numpy as np
    from scipy.optimize import nnls
    
    A = np.array([[1, 2, 0],
                  [0, 4, 3]])
    b = np.array([-1, -2])
    
    print(nnls(A, b))
    # (array([ 0.,  0.,  0.]), 2.23606797749979)
    

    In the case where A·x = b is underdetermined,

    x1, res, rnk, s = np.linalg.lstsq(A, b)
    

    will pick a solution x' that minimizes ||x||L2 subject to ||A·x - b||L2 = 0. This happens not to be the particular solution we are looking for, but we can linearly transform it to get what we want. In order to do that, we'll first compute the right null space of A, which characterizes the space of all possible solutions to A·x = b. We can get this using a rank-revealing QR decomposition:

    from scipy.linalg import qr
    
    def qr_null(A, tol=None):
        Q, R, P = qr(A.T, mode='full', pivoting=True)
        tol = np.finfo(R.dtype).eps if tol is None else tol
        rnk = min(A.shape) - np.abs(np.diag(R))[::-1].searchsorted(tol)
        return Q[:, rnk:].conj()
    
    Z = qr_null(A)
    

    Z is a vector (or, in case where n - rnk(A) > 1, a set of basis vectors spanning a subspace of A) such that A·Z = 0:

    print(A.dot(Z))
    # [[  0.00000000e+00]
    #  [  8.88178420e-16]]
    

    In other words, the column(s) of Z are vectors that are orthogonal to all of the rows in A. This means that for any solution x' to A·x = b, then x' = x + Z·c must also be a solution for any arbitrary scaling factor c. This means that by picking an appropriate value of c, we can set any n - rnk(A) of the coefficients in the solution to zero.

    For example, let's say we wanted to set the value of the last coefficient to zero:

    c = -x1[-1] / Z[-1, 0]
    x2 = x1 + Z * c
    print(x2)
    # [ -8.32667268e-17  -5.00000000e-01   0.00000000e+00]
    print(A.dot(x2))
    # [-1. -2.]
    

    The more general case where n - rnk(A) ≤ 1 is a little bit more complicated:

    A = np.array([[1, 4, 9, 6, 9, 2, 7],
                  [6, 3, 8, 5, 2, 7, 6],
                  [7, 4, 5, 7, 6, 3, 2],
                  [5, 2, 7, 4, 7, 5, 4],
                  [9, 3, 8, 6, 7, 3, 1]])
    x_exact = np.array([ 1,  2, -1, -2,  5,  0,  0])
    b = A.dot(x_exact)
    print(b)
    # [33,  4, 26, 29, 30]
    

    We get x' and Z as before:

    x1, res, rnk, s = np.linalg.lstsq(A, b)
    Z = qr_null(A)
    

    Now in order to maximise the number of zero-valued coefficients in the solution vector, we want to find a vector C such that

    x' = x + Z·C = [x'0, x'1, ..., x'rnk(A)-1, 0, ..., 0]T

    If the last n - rnk(A) coefficients in x' are to be zeros, this imposes that

    Z{rnk(A),...,n}·C = -x{rnk(A),...,n}

    We can thus solve for C (exactly, since we know that Z[rnk:] must be full-rank):

    C = np.linalg.solve(Z[rnk:], -x1[rnk:])
    

    and compute x' :

    x2 = x1 + Z.dot(C)
    print(x2)
    # [  1.00000000e+00   2.00000000e+00  -1.00000000e+00  -2.00000000e+00
    #    5.00000000e+00   5.55111512e-17   0.00000000e+00]
    print(A.dot(x2))
    # [ 33.   4.  26.  29.  30.]
    

    To put it all together into a single function:

    import numpy as np
    from scipy.linalg import qr
    
    def solve_minnonzero(A, b):
        x1, res, rnk, s = np.linalg.lstsq(A, b)
        if rnk == A.shape[1]:
            return x1   # nothing more to do if A is full-rank
        Q, R, P = qr(A.T, mode='full', pivoting=True)
        Z = Q[:, rnk:].conj()
        C = np.linalg.solve(Z[rnk:], -x1[rnk:])
        return x1 + Z.dot(C)
    
    0 讨论(0)
  • 2020-12-09 23:06
    np.array([[8],[18]]).shape
    

    is

    (2,1) 
    

    but you want

    (2,)
    
    #!/usr/bin/env python3
    import numpy as np
    from scipy.optimize import nnls
    test_A = np.array([[1,2,0],[0,4,3]])
    
    try:
        test_b = np.array([[8],[18]]) # wrong
        print(nnls(test_A,test_b))
    except Exception as e:
        print(str(e))
    
    test_b = np.array([8,18]) # sic!
    print(nnls(test_A,test_b))
    

    output:

    expected vector
    (array([ 0.        ,  4.        ,  0.66666667]), 0.0)
    
    0 讨论(0)
提交回复
热议问题