Question
I want to find the least-squares solution of a matrix equation, and I am using the numpy linalg.lstsq function:
weights = np.linalg.lstsq(semivariance, prediction, rcond=None)
The dimensions of my variables are:
semivariance is a float array of shape 5030 x 5030
prediction is a 1D array of length 5030
The problem I have is that it takes approximately 80 seconds to return the value of weights, and I have to repeat the calculation about 10000 times, so the computational time adds up quickly.
Is there a faster way or a more pythonic function to do this?
Answer 1:
@Brenlla appears to be right: even if you perform least squares via the Moore-Penrose pseudo-inverse (here, by solving the normal equations), it is significantly faster than np.linalg.lstsq:
import numpy as np
import time

# Random stand-ins with the same shapes as in the question.
semivariance = np.random.uniform(0, 100, [5030, 5030]).astype(np.float64)
prediction = np.random.uniform(0, 100, [5030, 1]).astype(np.float64)

# SVD-based solver used by np.linalg.lstsq.
start = time.time()
weights_lstsq = np.linalg.lstsq(semivariance, prediction, rcond=None)
print("Took: {}".format(time.time() - start))
>>> Took: 34.65818190574646

# Solving the normal equations instead.
start = time.time()
weights_pseudo = np.linalg.solve(semivariance.T.dot(semivariance), semivariance.T.dot(prediction))
print("Took: {}".format(time.time() - start))
>>> Took: 2.0434153079986572

# Both approaches return the same weights.
np.allclose(weights_lstsq[0], weights_pseudo)
>>> True
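
Since the question repeats the solve about 10000 times, there is one more saving worth noting: if semivariance stays fixed across iterations and only prediction changes, the normal-equations matrix can be factorised once and reused. A minimal sketch of that idea, assuming a fixed, full-column-rank semivariance (so that semivariance.T @ semivariance is positive definite and a Cholesky factorisation applies); the loop count here is a stand-in:

import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(0)
semivariance = rng.uniform(0, 100, (5030, 5030))

# Pay the O(n^3) factorisation cost once, outside the loop.
factorisation = cho_factor(semivariance.T @ semivariance)

for _ in range(10):  # stand-in for the ~10000 iterations in the question
    prediction = rng.uniform(0, 100, (5030, 1))
    # Each solve is now two O(n^2) triangular substitutions.
    weights = cho_solve(factorisation, semivariance.T @ prediction)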
The above is not on your exact matrices, but the comparison on these samples likely transfers. np.linalg.lstsq solves an optimisation problem, minimising ||b - Ax||^2 to find x in Ax = b. That approach pays off on extremely large problems, which is why linear models are often fitted with gradient descent in neural networks, but in this case the matrices just aren't large enough for the performance benefit.
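
To make that optimisation view concrete, here is least squares written as plain gradient descent on ||b - Ax||^2. This is a sketch only: the step size, iteration count, and small shapes are arbitrary illustrative choices, and np.linalg.lstsq itself uses a direct LAPACK routine rather than iteration.

import numpy as np

def lstsq_gd(a, b, n_iter=20000):
    """Toy gradient descent minimising ||b - a @ x||^2 (illustration only)."""
    x = np.zeros((a.shape[1], 1))
    # Gradient of the objective is 2 * a.T @ (a @ x - b); a step size of
    # 1 / (2 * spectral_norm(a)^2) keeps the iteration stable.
    lr = 1.0 / (2 * np.linalg.norm(a, 2) ** 2)
    for _ in range(n_iter):
        x -= lr * 2 * a.T @ (a @ x - b)
    return x

rng = np.random.default_rng(0)
a = rng.uniform(0, 100, (50, 5))  # small hypothetical shapes
b = rng.uniform(0, 100, (50, 1))
# Should print True once the iteration has converged to the lstsq solution.
print(np.allclose(lstsq_gd(a, b), np.linalg.lstsq(a, b, rcond=None)[0]))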
Source: https://stackoverflow.com/questions/50288513/faster-way-of-finding-least-square-solution-for-large-matrix