Question
I am working with Keras 2.0.0 and I'd like to train a deep model with a very large number of parameters on a GPU.
As my data set is large, I have to use the ImageDataGenerator. To be honest, I want to abuse the ImageDataGenerator in the sense that I don't want to perform any augmentations. I just want to put my training images into batches (and rescale them), so I can feed them to model.fit_generator.
I adapted the code from here and made some small changes according to my data (i.e. changing binary classification to categorical, but this doesn't matter for the problem discussed here).
I have 15000 training images and the only 'augmentation' I want to perform is rescaling to the range [0,1] by train_datagen = ImageDataGenerator(rescale=1./255).
After creating my 'train_generator':
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=True,
    seed=1337,
    save_to_dir=save_data_dir)
I fit the model by using model.fit_generator().
I set the number of epochs to: epochs = 1
And the batch size to: batch_size = 60
What I expect to see in the directory where my augmented (i.e. resized) images are stored: 15,000 rescaled images per epoch, i.e. with only one epoch, 15,000 rescaled images. But, mysteriously, there are 15,250 images.
Is there a reason for this number of images? Can I control the number of augmented images?
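The expected figure follows from simple batch arithmetic. Below is a minimal sketch using only the numbers given above. The variable names are illustrative, and since the fit_generator call itself is not shown, it is an assumption here that steps_per_epoch was chosen to cover the data exactly once.

```python
import math

# Numbers taken from the question.
num_train_images = 15000
batch_size = 60
epochs = 1
observed_images = 15250  # files found in save_to_dir

# Assumption: steps_per_epoch is chosen so that one epoch covers the data once.
steps_per_epoch = math.ceil(num_train_images / batch_size)

# Images the training loop actually consumes under that assumption.
expected_images = min(steps_per_epoch * batch_size, num_train_images) * epochs

# Files that cannot be explained by consumed batches alone.
surplus = observed_images - expected_images

print(steps_per_epoch)   # 250
print(expected_images)   # 15000
print(surplus)           # 250
```

Under that assumption training consumes exactly 15,000 images, so the surplus must come from batches that were generated but not consumed. One plausible cause, not confirmed in this thread, is that fit_generator fills a background queue of batches ahead of training (max_q_size, default 10, in Keras 2.0.x), and save_to_dir writes files whenever a batch is produced, whether or not it is later used.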
Similar problems:
Model fit_generator not pulling data samples as expected (and on Stack Overflow: Keras - How are batches and epochs used in fit_generator()?)
A concrete example for using data generator for large datasets such as ImageNet
I appreciate your help.
Answer 1:
If your requirement is to stream the data while training, the following link will be useful: its author has written a script for ImageDataGenerator and explained it very clearly. You can add further functionality to it, such as rescaling, with full control over the data generation.
https://stanford.edu/~shervine/blog/keras-how-to-generate-data-on-the-fly
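To illustrate what "full control" can look like, here is a minimal sketch of a hand-written batch generator. It is not the script from the linked post. The names batch_generator and transform are hypothetical, and the toy data stands in for real images. It uses plain Python lists to stay self-contained; for fit_generator each batch would have to be converted to NumPy arrays first.

```python
import random


def batch_generator(samples, labels, batch_size, transform=None,
                    shuffle=True, seed=1337):
    """Yield (batch_x, batch_y) forever; every pass visits each sample once."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    while True:  # fit_generator expects a generator that never ends
        if shuffle:
            rng.shuffle(indices)  # new order for each pass over the data
        for start in range(0, len(indices), batch_size):
            batch_idx = indices[start:start + batch_size]
            batch_x = [transform(samples[i]) if transform else samples[i]
                       for i in batch_idx]
            batch_y = [labels[i] for i in batch_idx]
            yield batch_x, batch_y


# Toy stand-in for 15000 images: one pixel value (0..255) per "image".
samples = [i % 256 for i in range(15000)]
labels = [i % 3 for i in range(15000)]

gen = batch_generator(samples, labels, batch_size=60,
                      transform=lambda v: v / 255.0)  # rescale to [0, 1]

steps_per_epoch = len(samples) // 60  # 250 full batches, no remainder
first_x, first_y = next(gen)
seen = len(first_x) + sum(len(next(gen)[0])
                          for _ in range(steps_per_epoch - 1))
print(seen)  # 15000
```

Because the generator only produces what is requested from it, the number of samples drawn per epoch is exactly steps_per_epoch times batch_size, and nothing is written to disk unless you add that yourself.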
Source: https://stackoverflow.com/questions/43604998/how-to-determine-amount-of-augmented-images-in-keras