Faster way of finding least-square solution for large matrix


Question


I want to find the least-squares solution of a matrix, and I am using NumPy's linalg.lstsq function:

weights = np.linalg.lstsq(semivariance, prediction, rcond=None)

The dimensions of my variables are:

semivariance is a float array of shape 5030 x 5030

prediction is a 1D array of length 5030

The problem is that it takes approximately 80 seconds to return the weights, and I have to repeat the calculation about 10,000 times, so the computation time adds up quickly.

Is there a faster or more Pythonic way to do this?


Answer 1:


@Brenlla appears to be right: even if you perform least squares by solving the normal equations (equivalent to applying the Moore-Penrose pseudoinverse), it is significantly faster than np.linalg.lstsq:

import numpy as np
import time

# Random data with the same shapes as in the question (a 5030 x 5030 system)
semivariance = np.random.uniform(0, 100, [5030, 5030]).astype(np.float64)
prediction = np.random.uniform(0, 100, [5030, 1]).astype(np.float64)

# Baseline: SVD-based least squares via np.linalg.lstsq
start = time.time()
weights_lstsq = np.linalg.lstsq(semivariance, prediction, rcond=None)
print("Took: {}".format(time.time() - start))

>>> Took: 34.65818190574646

# Normal equations: solve (A^T A) x = A^T b directly
start = time.time()
weights_pseudo = np.linalg.solve(semivariance.T.dot(semivariance),
                                 semivariance.T.dot(prediction))
print("Took: {}".format(time.time() - start))

>>> Took: 2.0434153079986572

# The two solutions agree (lstsq returns a tuple; element [0] is the solution)
np.allclose(weights_lstsq[0], weights_pseudo)

>>> True

The above is not on your exact matrices, but the concept likely transfers. np.linalg.lstsq solves an optimisation problem, minimizing ||b - Ax||^2 to find x in Ax = b. That kind of iterative/optimisation approach pays off on extremely large problems, which is why linear models in neural networks are often fit with gradient descent, but in this case your matrices just aren't large enough for it to be a benefit.
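One further thought, under an assumption the question doesn't confirm: if the semivariance matrix stays fixed across the ~10,000 repeats and only the prediction vector changes (as it often does in kriging-style workflows), you can factor the matrix once and reuse the factorization for every right-hand side. A minimal sketch using SciPy's LU routines (scipy.linalg.lu_factor / lu_solve), with placeholder random data standing in for the real inputs:

import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)

# Placeholder data; assumes the matrix is square, non-singular, and fixed
semivariance = rng.uniform(0, 100, (5030, 5030))
predictions = rng.uniform(0, 100, (10, 5030))  # e.g. 10 of the ~10,000 right-hand sides

# Pay the O(n^3) factorization cost once
lu, piv = lu_factor(semivariance)

# Each solve then reuses the factorization: only O(n^2) per right-hand side
weights = [lu_solve((lu, piv), p) for p in predictions]

Each lu_solve is just a forward/back substitution, so the expensive factorization is done once instead of 10,000 times. If the matrix genuinely changes on every repeat, this won't help and the normal-equations approach above is the better fit.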



Source: https://stackoverflow.com/questions/50288513/faster-way-of-finding-least-square-solution-for-large-matrix
