resampling data - using SMOTE from imblearn with 3D numpy arrays

Question

I want to resample my dataset. It consists of categorically transformed data with labels for 3 classes. The number of samples per class is:

counts of class A: 6945
counts of class B: 650
counts of class C: 9066
Total samples: 16661

The data shape without labels is (16661, 1000, 256), that is, 16661 samples of shape (1000, 256). I would like to up-sample the data to the number of samples in the majority class, that is, class A (6945).

However, when calling:

from imblearn.over_sampling import SMOTE
print(categorical_vector.shape)
sm = SMOTE(random_state=2)
X_train_res, y_labels_res = sm.fit_sample(categorical_vector, labels.ravel())

It keeps saying ValueError: Found array with dim 3. Estimator expected <= 2.

How can I flatten the data so that the estimator can fit it in a way that still makes sense? Furthermore, how can I restore the 3D shape after getting X_train_res?

Answer 1:

I am using a dummy 3D array and a 2D shape that I chose myself:

import numpy as np

arr = np.random.rand(160, 10, 25)
orig_shape = arr.shape
print(orig_shape)

Output: (160, 10, 25)

# Collapse to 2D: 160 * 25 = 4000 rows of 10 columns
arr = np.reshape(arr, (arr.shape[0] * arr.shape[2], arr.shape[1]))
print(arr.shape)

Output: (4000, 10)
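
Note that a resampler such as SMOTE treats every row of its 2D input as one sample, so for the use case in the question the first axis has to be kept and only the trailing axes merged. A minimal sketch of that variant, using the same dummy shape (the names `X` and `X_flat` are only illustrative):

```python
import numpy as np

# Dummy data with the same layout as above: 160 samples of shape (10, 25)
X = np.random.rand(160, 10, 25)

# Keep axis 0 (samples) and merge the remaining axes: 10 * 25 = 250 features
X_flat = X.reshape(X.shape[0], -1)
print(X_flat.shape)  # (160, 250)

# Row i of the flat array is exactly sample i, flattened in C order
assert np.array_equal(X_flat[3], X[3].ravel())
```

With this layout the label vector still has one entry per row, which is what the estimator expects.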

arr = np.reshape(arr, orig_shape)
print(arr.shape)

Output: (160, 10, 25)
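
After oversampling, the number of rows is larger than in the original array, so reshaping back to the stored `orig_shape` would fail; only the per-sample shape can be reused and the first axis has to be inferred. The following is a sketch under that assumption, not a definitive implementation: `resample_3d` is a hypothetical helper, and `resampler` is assumed to be any object with a `fit_resample(X, y)` method, which is the interface imblearn's `SMOTE` exposes in current releases (older releases named it `fit_sample`, as in the question). A small stand-in resampler that just duplicates rows is used so the snippet runs without imblearn.

```python
import numpy as np


def resample_3d(X, y, resampler):
    """Flatten each sample, resample, and restore the per-sample shape."""
    sample_shape = X.shape[1:]                 # e.g. (1000, 256)
    X_flat = X.reshape(X.shape[0], -1)         # (n_samples, n_features)
    X_res, y_res = resampler.fit_resample(X_flat, y)
    # -1 lets NumPy infer the new, larger number of samples
    return X_res.reshape(-1, *sample_shape), y_res


class DuplicatingResampler:
    """Stand-in for illustration only: repeats rows until classes are balanced."""

    def fit_resample(self, X, y):
        classes, counts = np.unique(y, return_counts=True)
        target = counts.max()
        idx = []
        for c, n in zip(classes, counts):
            members = np.flatnonzero(y == c)
            # np.resize cycles through the members until `target` entries exist
            idx.append(np.resize(members, target))
        idx = np.concatenate(idx)
        return X[idx], y[idx]


X = np.random.rand(12, 4, 5)                   # 12 samples of shape (4, 5)
y = np.array([0] * 8 + [1] * 4)                # imbalanced labels
X_out, y_out = resample_3d(X, y, DuplicatingResampler())
print(X_out.shape, np.bincount(y_out))         # (16, 4, 5) [8 8]
```

With imblearn installed, passing `SMOTE(random_state=2)` as `resampler` together with `categorical_vector` and `labels.ravel()` from the question should return an array of shape `(n_resampled, 1000, 256)`. Be aware that SMOTE interpolates between the flattened vectors, so each synthetic sample is an element-wise blend of neighbouring samples; whether that is meaningful for categorically encoded data depends on the encoding.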

Source: https://stackoverflow.com/questions/56125380/resampling-data-using-smote-from-imblearn-with-3d-numpy-arrays

Tags

python

numpy

imblearn
