Question
For some reason I cannot get this block of code to run properly anymore:
import numpy as np
from sklearn.linear_model import LinearRegression
# Create linear data with some noise
x = np.random.uniform(0, 100, 1000)
y = 2. * x + 3. + np.random.normal(0, 10, len(x))
# Fit linear data with sklearn LinearRegression
lm = LinearRegression()
lm.fit(x.reshape(-1, 1), y)
Traceback (most recent call last):
File "<input>", line 2, in <module>
File "C:\Python37\lib\site-packages\sklearn\linear_model\_base.py", line 547, in fit
linalg.lstsq(X, y)
File "C:\Python37\lib\site-packages\scipy\linalg\basic.py", line 1224, in lstsq
% (-info, lapack_driver))
ValueError: illegal value in 4-th argument of internal None
I'm not sure why I'm getting this error on such a simple example. Here are my current versions:
scipy.__version__
'1.5.0'
sklearn.__version__
'0.23.1'
I'm running this on 64-bit Windows 10 Enterprise with Python 3.7.3. I've tried uninstalling and reinstalling scipy and scikit-learn, trying earlier versions of scipy, and uninstalling and reinstalling Python. None of these solved the issue.
Update: It appears to be tied to matplotlib too. I was previously running this example in PyCharm, but I've moved to running it directly from PowerShell. If I run this code outside of PyCharm, I do not get an error:
import numpy as np
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
# Create linear data with some noise
x = np.random.uniform(0, 100, 1000)
y = 2. * x + 3. + np.random.normal(0, 10, len(x))
# Fit linear data with sklearn LinearRegression
lm = LinearRegression()
lm.fit(x.reshape(-1, 1), y)
However, if I plot the data before fitting, I get an error:
import numpy as np
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
# Create linear data with some noise
x = np.random.uniform(0, 100, 1000)
y = 2. * x + 3. + np.random.normal(0, 10, len(x))
# Plot data
plt.scatter(x, y)
plt.plot(np.linspace(0, 100, 10), 2. * np.linspace(0, 100, 10) + 3., ls='--', c='red')
# Fit linear data with sklearn LinearRegression
lm = LinearRegression()
lm.fit(x.reshape(-1, 1), y)
** On entry to DLASCLS parameter number 4 had an illegal value
Traceback (most recent call last):
File ".\run.py", line 18, in <module>
lm.fit(x.reshape(-1, 1), y)
File "C:\Python37\lib\site-packages\sklearn\linear_model\_base.py", line 547, in fit
linalg.lstsq(X, y)
File "C:\Python37\lib\site-packages\scipy\linalg\basic.py", line 1224, in lstsq
% (-info, lapack_driver))
ValueError: illegal value in 4-th argument of internal None
But if I comment out the line plt.plot(np.linspace(0, 100, 10), 2. * np.linspace(0, 100, 10) + 3., ls='--', c='red')
it works fine.
Answer 1:
It seems to happen only after a figure has been drawn with matplotlib; otherwise you can run the fit as many times as you like.
However, if you change the data type from float64 to float32 (Grzesik's answer), strangely enough the error disappears. This feels like a bug to me. Why would changing the data type affect the interaction between matplotlib and the LAPACK function called within sklearn?
This is more a question than an answer, but it is a bit scary to find these unexpected interactions across functions and data types.
import numpy as np
import sklearn
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
def main(print_matplotlib=False, dtype=np.float64):
    x = np.linspace(-3, 3, 100).astype(dtype)
    print(x.dtype)
    y = 2 * np.random.rand(x.shape[0]) * x + np.random.rand(x.shape[0])
    x = x.reshape((-1, 1))
    reg = LinearRegression().fit(x, y)
    print(reg.intercept_, reg.coef_)
    yh = reg.predict(x)
    if print_matplotlib:
        plt.scatter(x, y)
        plt.plot(x, yh)
        plt.show()
No plotting
if __name__ == "__main__":
    np.random.seed(64)
    main(print_matplotlib=False, dtype=np.float64)
    np.random.seed(64)
    main(print_matplotlib=False, dtype=np.float64)
    pass
float64
0.5957165420019624 [0.91960601]
float64
0.5957165420019624 [0.91960601]
Plotting dtype = np.float64
if __name__ == "__main__":
    np.random.seed(64)
    main(print_matplotlib=True, dtype=np.float64)
    np.random.seed(64)
    main(print_matplotlib=True, dtype=np.float64)
    pass
float64
0.5957165420019624 [0.91960601]
Plot 1
float64
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-3-52593a548324> in <module>
3 main(print_matplotlib = True)
4 np.random.seed(64)
----> 5 main(print_matplotlib = True)
6
7 pass
<ipython-input-1-11139051f2d3> in main(print_matplotlib, dtype)
11 x = x.reshape((-1,1))
12
---> 13 reg=LinearRegression().fit(x,y)
14 print(reg.intercept_,reg.coef_)
15
~\Anaconda3\lib\site-packages\sklearn\linear_model\_base.py in fit(self, X, y, sample_weight)
545 else:
546 self.coef_, self._residues, self.rank_, self.singular_ = \
--> 547 linalg.lstsq(X, y)
548 self.coef_ = self.coef_.T
549
~\AppData\Roaming\Python\Python37\site-packages\scipy\linalg\basic.py in lstsq(a, b, cond, overwrite_a, overwrite_b, check_finite, lapack_driver)
1249 if info < 0:
1250 raise ValueError('illegal value in %d-th argument of internal %s'
-> 1251 % (-info, lapack_driver))
1252 resids = np.asarray([], dtype=x.dtype)
1253 if m > n:
ValueError: illegal value in 4-th argument of internal None
Plotting dtype=np.float32
if __name__ == "__main__":
    np.random.seed(64)
    main(print_matplotlib=True, dtype=np.float32)
    np.random.seed(64)
    main(print_matplotlib=True, dtype=np.float32)
    pass
Output 2
Answer 2:
As of numpy 1.19.1 and sklearn v0.23.2, I found that polyfit(deg=1) and LinearRegression().fit() raised unexpected errors for no apparent reason. The data did not contain any NaN or Inf values. I eventually used scipy.stats.linregress() instead:
from scipy import stats
slope, intercept, r_value, p_value, std_err = stats.linregress(x.astype(np.float32), y.astype(np.float32))
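For reference, here is a minimal self-contained sketch of this workaround on synthetic data shaped like the question's. The seed and variable names are illustrative assumptions, and the sketch leaves out the float32 cast used above, so keep the cast if the error still shows up on your machine.

```python
import numpy as np
from scipy import stats

# Synthetic linear data with noise, mirroring the question (seed is illustrative).
rng = np.random.default_rng(0)
x = rng.uniform(0, 100, 1000)
y = 2.0 * x + 3.0 + rng.normal(0, 10, len(x))

# linregress fits a simple one-variable least-squares line.
result = stats.linregress(x, y)
slope, intercept = result.slope, result.intercept
print(slope, intercept)
```

The fitted slope and intercept should land close to the true values 2 and 3 used to generate the data.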
Answer 3:
First check for NaN and inf values, and also try normalize=True (X and y below stand for your own data):
lreg = LinearRegression(fit_intercept=True, normalize=True, copy_X=True).fit(X, y)
Neither of these worked for me, and my data had no NaN or inf values. While experimenting, though, I found that running the same code a second time works, so I did this:
try:
    lreg = LinearRegression(fit_intercept=True, normalize=True, copy_X=True).fit(X, y)
except ValueError:
    lreg = LinearRegression(fit_intercept=True, normalize=True, copy_X=True).fit(X, y)
I don't know why this works, but retrying the same fit after the first failure solved the problem for me.
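The retry idea can be wrapped in a small helper so the fit call is not duplicated. This is only a sketch: fit_with_retry and flaky_fit are hypothetical names, and flaky_fit merely simulates a call that fails once with the question's error and then succeeds.

```python
def fit_with_retry(fit, attempts=2):
    """Call fit() up to `attempts` times, retrying only when it raises ValueError."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return fit()
        except ValueError as err:
            last_error = err  # remember the failure and try again
    raise last_error


# Hypothetical stand-in for a fit that fails on the first call only.
calls = {"n": 0}

def flaky_fit():
    calls["n"] += 1
    if calls["n"] == 1:
        raise ValueError("illegal value in 4-th argument of internal None")
    return "fitted"

print(fit_with_retry(flaky_fit))
```

With scikit-learn this would be called as fit_with_retry(lambda: LinearRegression().fit(X, y)). Retrying hides the underlying problem rather than fixing it, so treat it as a stopgap.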
Answer 4:
You are missing plt.show() in your code. Put it after this line:
plt.plot(np.linspace(0, 100, 10), 2. * np.linspace(0, 100, 10) + 3., ls='--', c='red')
plt.show()
Answer 5:
I suggest using the parameter normalize=True in your code to avoid this:
LinearRegression(fit_intercept=True,
                 normalize=True,
                 copy_X=True,
                 n_jobs=None)
This resolved the error for me.
Answer 6:
In your code, change this line:
lm.fit(x.reshape(-1, 1), y)
to:
lm.fit(x.reshape(-1, 1).astype(np.float32), y)
Answer 7:
This appears to be caused by a bug in Windows (update 2004?).
- The posted problem https://github.com/scipy/scipy/issues/12893
- is a duplicate of https://github.com/scipy/scipy/issues/12747 and
- is caused by https://github.com/numpy/numpy/issues/16744
It is related to whether NumPy can interface with a particular Basic Linear Algebra Subprograms (BLAS) implementation.
The most popular workarounds are to install NumPy using conda or to use a non-Windows OS (e.g. GNU/Linux). conda bundles the Intel Math Kernel Library (MKL), which does not have the issue, and non-Windows systems don't have Windows's problems. Supposedly Microsoft will provide a patch sometime around January 2021.
If this issue affects you, as it does many others, please remember that for Numpy, as well as Python and many other Free packages, the license clearly states,
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
Please be mindful of that (i.e. be polite and respectful) in any comments toward the developers of these systems.
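To check whether a machine matches the configuration described in this answer, a small diagnostic along these lines may help. It is a sketch under assumptions: the helper name is hypothetical, and build number 19041 is my assumption for how Windows 10 version 2004 reports itself through platform.version().

```python
import platform


def is_windows_2004(system, version):
    """Return True if the OS strings look like Windows 10 version 2004.

    Assumes version 2004 reports build 19041, e.g. '10.0.19041'.
    """
    if system != "Windows":
        return False
    parts = version.split(".")
    return len(parts) >= 3 and parts[2] == "19041"


print(is_windows_2004(platform.system(), platform.version()))
```

Separately, numpy.show_config() prints which BLAS/LAPACK build NumPy was linked against, which helps tell an MKL-based conda install apart from a pip wheel.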
Answer 8:
In scipy/linalg/basic.py, the lstsq function is defined at line 1031, and its lapack_driver argument defaults to None. At line 1162, if driver is None, it is set to 'gelsd'. I think 'gelsd' is the problem: if you change this to driver = 'gelsy', the code works.
Source: https://stackoverflow.com/questions/62561902/valueerror-illegal-value-in-4-th-argument-of-internal-none-when-running-sklearn
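Rather than editing the installed SciPy source, the driver can be chosen per call through the lapack_driver argument of scipy.linalg.lstsq. The sketch below assumes synthetic data like the question's and builds the intercept column by hand. It bypasses LinearRegression entirely, so it illustrates the driver switch and is not a drop-in replacement.

```python
import numpy as np
from scipy import linalg

# Synthetic linear data with noise, mirroring the question (seed is illustrative).
rng = np.random.default_rng(0)
x = rng.uniform(0, 100, 1000)
y = 2.0 * x + 3.0 + rng.normal(0, 10, len(x))

# Design matrix: one column for x, one column of ones for the intercept.
A = np.column_stack([x, np.ones_like(x)])

# Request the 'gelsy' driver explicitly instead of the default 'gelsd'.
coef, residues, rank, singular_values = linalg.lstsq(A, y, lapack_driver="gelsy")
slope, intercept = coef
print(slope, intercept, rank)
```

Whether 'gelsy' avoids the error on an affected machine is something to verify locally; this only shows how to select it.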