What is the difference between numpy.linalg.lstsq and scipy.linalg.lstsq?

深忆病人 2021-02-01 15:38

lstsq tries to solve Ax=b minimizing |b - Ax|. Both scipy and numpy provide a linalg.lstsq function with a very similar inter

2 answers
  • 2021-02-01 15:52

    Numpy 1.13 - June 2017

    As of Numpy 1.13 and Scipy 0.19, both scipy.linalg.lstsq() and numpy.linalg.lstsq() call the same LAPACK routine by default, DGELSD (see LAPACK documentation).

    However, an important current difference between the two functions is the default value of the LAPACK parameter RCOND (called rcond by Numpy and cond by Scipy), which sets the threshold below which singular values are treated as zero.

    Scipy uses a good and robust default: singular values below eps*max(A.shape)*S[0] are discarded, where S[0] is the largest singular value of A. Numpy uses a default of rcond=-1, which tells LAPACK to use machine precision as the cut-off ratio relative to S[0], regardless of the dimensions of A.

    Numpy's default approach is basically useless in realistic applications and will generally result in a very degenerate solution when A is nearly rank deficient, wasting the accuracy of the singular value decomposition (SVD) used by DGELSD. This implies that in Numpy the optional parameter rcond should always be set explicitly.
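The effect of the cut-off can be sketched with a small experiment. This is only an illustration under assumptions: the test matrix, the noise levels, and the seed are made up for the example, and the exact numbers will vary with the LAPACK build. It compares the old default (rcond=-1) with a cut-off ratio scaled by the matrix size on a nearly rank-deficient A.

```python
import numpy as np

# Hypothetical test problem: the third column is almost a copy of the first,
# so A is nearly rank deficient.
rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 3))
A[:, 2] = A[:, 0] + 2e-14 * rng.standard_normal(1000)

# Right-hand side lying close to the span of the first two columns.
b = A[:, 0] + A[:, 1] + 1e-3 * rng.standard_normal(1000)

eps = np.finfo(A.dtype).eps

# Old Numpy default: LAPACK uses machine precision as the cut-off ratio.
x_old, _, rank_old, _ = np.linalg.lstsq(A, b, rcond=-1)

# Cut-off ratio scaled by the larger matrix dimension.
x_new, _, rank_new, _ = np.linalg.lstsq(A, b, rcond=eps * max(A.shape))

print("rank with rcond=-1:          ", rank_old)
print("rank with eps*max(A.shape):  ", rank_new)
print("norm of x with rcond=-1:     ", np.linalg.norm(x_old))
print("norm of x with scaled cutoff:", np.linalg.norm(x_new))
```

With the scaled cut-off the tiny singular value is discarded, and the solution stays close to the minimum-norm answer [0.5, 1, 0.5]. Whenever the smaller cut-off keeps that singular value, the noise in b is divided by it and the solution norm grows accordingly.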

    Update: Numpy 1.14 - January 2018

    I reported the incorrect default of rcond (see the section above) in numpy.linalg.lstsq(), and the function now raises a FutureWarning in Numpy 1.14 when rcond is not given (see Future Changes).

    The future behaviour will be identical in both scipy.linalg.lstsq() and numpy.linalg.lstsq(). In other words, Scipy and Numpy will not only use the same LAPACK code, but also the same defaults.

    To start using the proper (i.e. future) default in Numpy 1.14, one should call numpy.linalg.lstsq() with an explicit rcond=None.
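A minimal sketch of that call, assuming a small made-up line-fitting problem (the data values are illustrative only). According to the Numpy documentation, rcond=None selects machine precision times max(M, N) as the cut-off ratio, so passing that value explicitly should give the same result.

```python
import numpy as np

# Hypothetical data: fit y = c0 + c1 * t through four points.
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
b = np.array([6.0, 5.0, 7.0, 10.0])

# Explicit rcond=None opts in to the new default and silences the warning.
x, residuals, rank, sv = np.linalg.lstsq(A, b, rcond=None)

# The same cut-off ratio written out by hand.
explicit = np.finfo(A.dtype).eps * max(A.shape)
x_explicit, *_ = np.linalg.lstsq(A, b, rcond=explicit)

print("coefficients:", x)
print("rank:", rank)
```

Passing rcond=-1 instead keeps the old behaviour, which is useful only when reproducing earlier results.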

  • 2021-02-01 16:10

    If I read the source code right (Numpy 1.8.2, Scipy 0.14.1), numpy.linalg.lstsq() uses the LAPACK routine xGELSD and scipy.linalg.lstsq() uses xGELSS.

    The LAPACK Manual, Sec. 2.4, states:

    The subroutine xGELSD is significantly faster than its older counterpart xGELSS, especially for large problems, but may require somewhat more workspace depending on the matrix dimensions.

    That means that, in those versions, Numpy is faster but may use more memory.

    Update August 2017:

    Scipy now uses xGELSD by default; see https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.lstsq.html
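In recent Scipy versions the driver can be chosen through the lapack_driver argument of scipy.linalg.lstsq(). The sketch below, which assumes a made-up, well-conditioned random problem, calls both drivers and checks that they agree with each other and with Numpy. It does not measure speed or memory.

```python
import numpy as np
from scipy import linalg

# Hypothetical well-conditioned problem: 200 equations, 5 unknowns.
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 5))
b = rng.standard_normal(200)

# Divide-and-conquer SVD driver (the current Scipy default).
x_gelsd, _, rank_gelsd, _ = linalg.lstsq(A, b, lapack_driver="gelsd")

# Older SVD driver, as used by early Scipy versions.
x_gelss, _, rank_gelss, _ = linalg.lstsq(A, b, lapack_driver="gelss")

# Numpy with the new default cut-off, for comparison.
x_numpy, _, rank_numpy, _ = np.linalg.lstsq(A, b, rcond=None)

print("max difference gelsd vs gelss:", np.max(np.abs(x_gelsd - x_gelss)))
print("max difference gelsd vs numpy:", np.max(np.abs(x_gelsd - x_numpy)))
```

For a full-rank problem like this one, the choice of driver should affect only run time and workspace, not the solution beyond rounding error.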
