Question
I'm trying to replicate the following batch generator for a project. However, I'm having issues reshaping my data. The goal of the function is to take an array of shape [6000, 3000] and reshape it to shape [batch_size, 100, 3000, 1].
Functioning code I'm trying to replicate:
def gen(dict_files, aug=False):
    while True:
        record_name = random.choice(list(dict_files.keys()))
        batch_data = dict_files[record_name]
        all_rows = batch_data['x']
        for i in range(batch_size):
            start_index = random.choice(range(all_rows.shape[0]-WINDOW_SIZE))
            X = all_rows[start_index:start_index+WINDOW_SIZE, ...]
            Y = batch_data['y'][start_index:start_index+WINDOW_SIZE]
            X = np.expand_dims(X, 0)
            Y = np.expand_dims(Y, -1)
            Y = np.expand_dims(Y, 0)
            yield X, Y
gen outputs X, Y:
X.shape=(batch_size, 100, 3000, 1)
Y.shape=(batch_size, 100, 1)
My code:
Parameter definitions:
features = array of shape [6000, 3000], labels = array of shape [6000,]
def generator(features, labels, batch_size):
    ##Define batch shapes
    X_train_batch = np.zeros((batch_size,100, 3000, 1))
    y_train_batch = np.zeros((batch_size,100, 1))
    while True:
        sample_index = random.choice(range(features.shape[0]))
        sample_data = features[sample_index]
        ##Generating training batches
        for i in range(batch_size):
            start_index = random.choice(range(sample_data.shape[0]-100)) #pick random start point in signal (of length 3000timesteps)
            X = sample_data[start_index:start_index + 100, ...] #record 100 timesteps in signal from rand start point
            Y = labels[start_index:start_index + 100] #Record classification of X
            #print(X.shape) #gives arrays of (100,), should be (100,3000)
            ##reshaping to input shape taken by model
            X = np.expand_dims(X, 0)
            Y = np.expand_dims(Y, -1)
            Y = np.expand_dims(Y, 0)
            ##Collecting samples into correct batch size
            #X_train_batch[i] = X
            #y_train_batch[i] = Y
        print(y_train_batch.shape) #gives (32,100,1) which is correct!

generator(features, labels, 32)
Can someone explain the function of the ellipsis ('...') in X = all_rows[start_index:start_index+WINDOW_SIZE, ...]? From my understanding, the ellipsis pulls in the 3000 timesteps of sample_data to give an output of shape (100, 3000), but I'm apparently misunderstanding something, as I can't get the same behavior in my code.
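To make the question concrete, here is a minimal sketch of how '...' behaves in NumPy basic indexing. The arrays are zero-filled stand-ins, and the names all_rows_3d, window, row_slice and rows_window are illustrative only. The 3-D shape of all_rows_3d is an assumption, inferred from the trailing (100, 3000, 1) in gen's output shape, and the first dimension is 600 rather than 6000 just to keep the example light.

```python
import numpy as np

# Assumed stand-in for batch_data['x']: 3-D, so a row window already has a trailing axis
all_rows_3d = np.zeros((600, 3000, 1))
window = all_rows_3d[10:110, ...]    # '...' expands to ':' for every remaining axis
same_window = all_rows_3d[10:110]    # trailing ':' are implied, so this is equivalent

# Stand-in for the 2-D features array from my code
features = np.zeros((600, 3000))
sample_data = features[0]              # integer index drops axis 0, leaving shape (3000,)
row_slice = sample_data[10:110, ...]   # no axes are left for '...' to fill: shape (100,)
rows_window = features[10:110, ...]    # slicing rows of the 2-D array: shape (100, 3000)

print(window.shape, same_window.shape, row_slice.shape, rows_window.shape)
```

In other words, the ellipsis never adds data or axes; it only stands for "all remaining axes, unsliced". Picking a single row first removes the axis that the window is supposed to run over.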
Following on from that, how could I replicate what gen is outputting with my code?
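One possible way to get batches of that shape from the 2-D features array is sketched below. It assumes that each row of features is one 3000-step sample and that a window means 100 consecutive rows, as in gen. The names batch_generator and window_size are hypothetical and not part of the original code. The sketch fills preallocated batch arrays and yields once per batch.

```python
import random
import numpy as np

WINDOW_SIZE = 100  # assumed window length, matching the 100 used above


def batch_generator(features, labels, batch_size, window_size=WINDOW_SIZE):
    """Hypothetical sketch: yield (X, Y) batches of random row windows."""
    n_rows, n_steps = features.shape
    X_batch = np.zeros((batch_size, window_size, n_steps, 1), dtype=features.dtype)
    y_batch = np.zeros((batch_size, window_size, 1), dtype=labels.dtype)
    while True:
        for i in range(batch_size):
            start = random.randrange(n_rows - window_size)
            # Slice rows of the 2-D array directly (no single-row indexing first),
            # then append a trailing axis of length 1 with np.newaxis.
            X_batch[i] = features[start:start + window_size, ..., np.newaxis]
            y_batch[i] = labels[start:start + window_size, np.newaxis]
        # Copy so the caller's arrays are not overwritten by the next batch
        yield X_batch.copy(), y_batch.copy()


# Small zero-filled stand-ins (600 rows instead of 6000) to check the shapes
demo_features = np.zeros((600, 3000))
demo_labels = np.zeros(600)
X, Y = next(batch_generator(demo_features, demo_labels, batch_size=4))
print(X.shape, Y.shape)
```

Whether windows should come from one contiguous run of rows, as here, or from separate records depends on how the data is organised, so treat the sampling strategy as an assumption.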
Source: https://stackoverflow.com/questions/60646209/issues-reshaping-numpy-array-using-ellipsis