Question
I have an Extension column (for example .exe, .py, .xml, .doc) in my DataFrame. When I run the script from the terminal on a large data set, I get the MemoryError from the title.
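import os
from sys import getsizeof

import numpy as np
from joblib import dump  # assumed: dump(obj, path) matches joblib's signature
from sklearn.preprocessing import OneHotEncoder

encoder = OneHotEncoder(handle_unknown='ignore')
encoder.fit(features['Extension'].values.reshape(-1, 1))
# .toarray() converts the sparse result to a dense float64 array; the MemoryError is raised here
temp = encoder.transform(features['Extension'].values.reshape(-1, 1)).toarray()
print("Size of array in bytes", getsizeof(temp))
print("Array :-", temp)
print("Shape :- ", features.shape, temp.shape)
dump(encoder, os.path.join(os.getcwd(), 'model_dumps', 'encoder.pkl'))
# Drop the raw column once; a second drop of the same column would raise KeyError
features.drop(columns=['Extension'], inplace=True)
features = featureScaling(features)  # featureScaling is the asker's own helper, not shown
features = np.concatenate((features, temp), axis=1)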
OUTPUT -
1) Size of array in bytes :- 8884558912
2) Array :-
[[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
...
[1. 0. 0. ... 0. 0. 0.]
[1. 0. 0. ... 0. 0. 0.]
[1. 0. 0. ... 0. 0. 0.]]
3)Shape :- (323310, 8) (323310, 3435)
Answer 1:
The error message says it all:
MemoryError: Unable to allocate 8.27 GiB for an array with shape (323313, 3435) and data type float64
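The size in that message can be checked by hand: a dense float64 array stores 8 bytes per element. A minimal sketch of the arithmetic, using only the shape and dtype quoted in the error:

```python
# Shape and dtype taken from the MemoryError message.
rows, cols = 323313, 3435
itemsize = 8  # bytes per float64 element

n_bytes = rows * cols * itemsize
gib = n_bytes / 2**30  # 1 GiB = 2**30 bytes

print(n_bytes)         # total bytes NumPy tries to allocate
print(round(gib, 2))   # the same figure in GiB
```

Almost all of those bytes hold zeros, since each one-hot row contains a single 1 among 3435 columns.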
The dense array needs about 8.27 GiB (323313 × 3435 float64 values at 8 bytes each). It looks like your machine has around 8 GB of RAM, so Python cannot fit this array in memory. Running on a machine with more RAM, or upgrading the existing one, should let this allocation succeed.
Source: https://stackoverflow.com/questions/64964570/memoryerror-unable-to-allocate-8-27-gib-for-an-array-with-shape-323313-3435
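If more RAM is not an option, one possible alternative is to skip `.toarray()` and keep the encoder's output sparse, so only the non-zero entries are stored. The following is a minimal sketch under assumptions, not the asker's actual pipeline: the tiny `features` frame and its `Size` column are made up for illustration, and the asker's `featureScaling` step is left out.

```python
import numpy as np
import pandas as pd
from scipy import sparse
from sklearn.preprocessing import OneHotEncoder

# Hypothetical stand-in for the question's `features` DataFrame.
features = pd.DataFrame({
    "Size": [10.0, 20.0, 30.0, 40.0],
    "Extension": [".exe", ".py", ".xml", ".py"],
})

encoder = OneHotEncoder(handle_unknown="ignore")
# The encoder returns a scipy.sparse matrix by default; do not call .toarray().
encoded = encoder.fit_transform(features[["Extension"]])

# Remaining numeric columns, converted to sparse so both parts can be stacked.
numeric = features.drop(columns=["Extension"]).to_numpy(dtype=np.float64)
combined = sparse.hstack([sparse.csr_matrix(numeric), encoded], format="csr")

print(combined.shape)  # rows x (numeric columns + one-hot columns)
print(combined.nnz)    # number of stored non-zero entries
```

Many scikit-learn estimators accept sparse input directly, so the combined matrix may never need to be densified. Whether that holds for the model used downstream here is an assumption to verify.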