I am training a Convolutional Neural Network on a dataset of face images. The dataset has 10,000 images of dimensions 700 x 700, and my model has 12 layers. I am using a generator
You have to make sure that your data generator shuffles the data between epochs. I would suggest creating a list of the possible indices outside of your training loop, randomizing it with random.shuffle, and then iterating over it inside the loop.
Source: https://github.com/keras-team/keras/issues/2389 and own experience.
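A minimal sketch of that pattern, using small hypothetical NumPy arrays in place of the actual image data:

```python
import random

import numpy as np

# Toy stand-ins for the real image and label arrays.
x = np.arange(20).reshape(10, 2)
y = np.arange(10)
batch_size = 4

# Build the index list once, outside the epoch loop.
indices = list(range(len(x)))

for epoch in range(3):
    random.shuffle(indices)  # reshuffle at the start of every epoch
    for start in range(0, len(indices), batch_size):
        batch_idx = indices[start:start + batch_size]
        batch_x, batch_y = x[batch_idx], y[batch_idx]
        # feed batch_x, batch_y to the model here
```

Shuffling the indices rather than the arrays themselves avoids copying the (potentially large) image data.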
It is most probably due to the lack of data shuffling in your data generator. I ran into the same problem. I tried setting shuffle=True, but without success. Then I integrated shuffling into my custom generator. Here is the custom generator suggested by the Keras documentation:
import math

from keras.utils import Sequence

class Generator(Sequence):
    # Dataset wrapper for better training performance
    def __init__(self, x_set, y_set, batch_size=256):
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch
        return math.ceil(self.x.shape[0] / self.batch_size)

    def __getitem__(self, idx):
        batch_x = self.x[idx * self.batch_size:(idx + 1) * self.batch_size]
        batch_y = self.y[idx * self.batch_size:(idx + 1) * self.batch_size]
        return batch_x, batch_y
Here it is with shuffling added:
import math

import numpy as np
from keras.utils import Sequence

class Generator(Sequence):
    # Dataset wrapper for better training performance
    def __init__(self, x_set, y_set, batch_size=256):
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size
        self.indices = np.arange(self.x.shape[0])

    def __len__(self):
        # Number of batches per epoch
        return math.ceil(self.x.shape[0] / self.batch_size)

    def __getitem__(self, idx):
        # Index through the (shuffled) index array instead of slicing
        # the data directly
        inds = self.indices[idx * self.batch_size:(idx + 1) * self.batch_size]
        batch_x = self.x[inds]
        batch_y = self.y[inds]
        return batch_x, batch_y

    def on_epoch_end(self):
        # Called by Keras at the end of every epoch
        np.random.shuffle(self.indices)
Then the model converged nicely. Credits to fculinovic.
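To sanity-check the shuffling logic, here is a hypothetical standalone copy of the class (named ShuffledGenerator, without the keras Sequence base class so it runs anywhere) driven on tiny arrays. It verifies that each epoch still visits every sample exactly once, just in a different order:

```python
import math

import numpy as np

class ShuffledGenerator:
    # Standalone sketch of the shuffling generator above,
    # without the keras Sequence base class.
    def __init__(self, x_set, y_set, batch_size=4):
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size
        self.indices = np.arange(self.x.shape[0])

    def __len__(self):
        return math.ceil(self.x.shape[0] / self.batch_size)

    def __getitem__(self, idx):
        inds = self.indices[idx * self.batch_size:(idx + 1) * self.batch_size]
        return self.x[inds], self.y[inds]

    def on_epoch_end(self):
        np.random.shuffle(self.indices)

x = np.arange(10)
y = np.arange(10) * 2
gen = ShuffledGenerator(x, y)
for epoch in range(2):
    seen = np.concatenate([gen[i][0] for i in range(len(gen))])
    # Every sample appears exactly once per epoch
    assert sorted(seen.tolist()) == list(range(10))
    gen.on_epoch_end()  # Keras calls this automatically between epochs
```

With the real Keras class, passing the generator to model.fit is enough; Keras invokes on_epoch_end for you after each epoch.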