Question
I am building a Stochastic Gradient Descent classifier (SGDClassifier) with scikit-learn. While fitting my training data (of shape (60000, 784)), I get a MemoryError. How can I fix it?
I have already tried switching from a 32-bit to a 64-bit IDE, and reducing the amount of training data would hurt performance, so that is not an option.
Code (Python 3.7):
# Classification Problem
# Date: 1st September 2019
# Author: Pranay Saha

import pandas as pd
from sklearn.linear_model import SGDClassifier

x_train = pd.read_csv('mnist_train.csv')
y_train = x_train['label'].copy()
x_train = x_train.drop('label', axis=1)

print(x_train.shape)
print(y_train.shape)

my_model = SGDClassifier(random_state=42)
my_model.fit(x_train, y_train)
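I have read that SGDClassifier also supports incremental training via partial_fit, which would avoid materialising the full float64 copy at once. A minimal sketch of what I think that would look like is below (it uses a small random DataFrame as a stand-in for my mnist_train.csv, and the chunk size is arbitrary) — would something like this avoid the allocation?

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import SGDClassifier

# Random stand-in for mnist_train.csv: 784 pixel columns plus a 'label' column.
rng = np.random.default_rng(42)
df = pd.DataFrame(rng.integers(0, 256, size=(1000, 784)))
df.insert(0, 'label', rng.integers(0, 10, size=1000))

my_model = SGDClassifier(random_state=42)
classes = np.arange(10)  # every possible label must be declared up front for partial_fit
chunk_size = 200
for start in range(0, len(df), chunk_size):
    chunk = df.iloc[start:start + chunk_size]
    y = chunk['label'].to_numpy()
    X = chunk.drop('label', axis=1).to_numpy(dtype=np.float32)
    # any float64 conversion inside scikit-learn now happens one chunk at a time
    my_model.partial_fit(X, y, classes=classes)
```

(With a real file, pd.read_csv('mnist_train.csv', chunksize=...) could replace the iloc slicing so the CSV is never fully loaded either.)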
I expected the model to fit. The error message is as follows:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 60000 entries, 0 to 59999
Columns: 784 entries, 1x1 to 28x28
dtypes: int64(784)
memory usage: 358.9 MB
None
Index(['1x1', '1x2', '1x3', '1x4', '1x5', '1x6', '1x7', '1x8', '1x9', '1x10',
...
'28x19', '28x20', '28x21', '28x22', '28x23', '28x24', '28x25', '28x26',
'28x27', '28x28'],
dtype='object', length=784)
(60000, 784)
(60000,)
C:\Users\Pranay\AppData\Local\Programs\Python\Python36-32\Lib\site-packages\sklearn\linear_model\stochastic_gradient.py:128: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDClassifier'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
"and default tol will be 1e-3." % type(self), FutureWarning)
Traceback (most recent call last):
File "g:\Machine Learning Notebooks\MNIST Classification\main.py", line 19, in <module>
if my_model.fit(x_train, y_train):
File "C:\Users\Pranay\AppData\Local\Programs\Python\Python36-32\Lib\site-packages\sklearn\linear_model\stochastic_gradient.py", line 586, in fit
sample_weight=sample_weight)
File "C:\Users\Pranay\AppData\Local\Programs\Python\Python36-32\Lib\site-packages\sklearn\linear_model\stochastic_gradient.py", line 418, in _fit
X, y = check_X_y(X, y, 'csr', dtype=np.float64, order="C")
File "C:\Users\Pranay\AppData\Local\Programs\Python\Python36-32\Lib\site-packages\sklearn\utils\validation.py", line 573, in check_X_y
ensure_min_features, warn_on_dtype, estimator)
File "C:\Users\Pranay\AppData\Local\Programs\Python\Python36-32\Lib\site-packages\sklearn\utils\validation.py", line 433, in check_array
array = np.array(array, dtype=dtype, order=order, copy=copy)
numpy.core._exceptions.MemoryError: Unable to allocate array with shape (60000, 784) and data type float64
Source: https://stackoverflow.com/questions/57748040/how-to-fix-numpy-core-exceptions-memoryerror-while-performing-mnist-digit-cla