I am using Keras with a TensorFlow backend in Python. To be more precise, TensorFlow 1.2.1 and its built-in contrib.keras lib.
I want to use the
The validation generator works exactly like the training generator. You define how many batches it will yield per epoch: the training generator yields steps_per_epoch batches, and the validation generator yields validation_steps batches.
But validation data has absolutely no relation to training data. There is no need to separate validation batches according to training batches (I would even say that there is no point in doing that, unless you have a very specific intention). Also, the total number of samples in the training data is not related to the total number of samples in the validation data.
The point of having many batches is just to spare your computer's memory, so you process smaller packs one at a time. Typically, you pick a batch size that fits your memory or expected training time and stick with it.
That said, Keras leaves this entirely up to you, so you can define the training and the validation batches as you wish.
Ideally, you use all your validation data at once. If you use only part of your validation data, you will get different metrics for each batch, which may make you think that your model got worse or better when it actually didn't; you just measured different validation sets.
That's why they suggest validation_steps = total_validation_samples // validation_batch_size.
Theoretically, you test on your entire data every epoch, just as you theoretically should also train on your entire data every epoch.
So, theoretically, each epoch yields:
steps_per_epoch = TotalTrainingSamples / TrainingBatchSize
validation_steps = TotalValidationSamples / ValidationBatchSize
Basically, the two vars are: how many batches per epoch you will yield.
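As a minimal sketch of the arithmetic above, assuming made-up sample counts and batch sizes: the helper `steps_for` and all numbers are illustrative and not part of Keras. Floor division drops a final partial batch, while rounding up covers every sample, provided your generator actually yields that smaller last batch.

```python
import math


def steps_for(total_samples, batch_size, keep_partial_batch=False):
    """Number of batches needed for one pass over the data.

    Hypothetical helper for illustration, not a Keras function.
    """
    if keep_partial_batch:
        # Round up so the last, smaller batch is counted too.
        return math.ceil(total_samples / batch_size)
    # Round down: samples in an incomplete last batch are skipped.
    return total_samples // batch_size


# Illustrative numbers only.
total_training_samples = 1000
training_batch_size = 32
total_validation_samples = 200
validation_batch_size = 50

steps_per_epoch = steps_for(total_training_samples, training_batch_size)
validation_steps = steps_for(total_validation_samples, validation_batch_size)
steps_per_epoch_full = steps_for(
    total_training_samples, training_batch_size, keep_partial_batch=True
)

print(steps_per_epoch, validation_steps, steps_per_epoch_full)
```

These values would then be passed as steps_per_epoch and validation_steps when fitting from generators.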
This makes sure that at each epoch you train on exactly your entire training set and validate on exactly your entire validation set.
Nevertheless, it's totally up to you how you separate your training and validation data.
If you do want to have one different batch per epoch (epochs using less than your entire data), that's fine: just pass steps_per_epoch=1 or validation_steps=1, for instance. The generator is not reset after each epoch, so the second epoch will take the second batch, and so on, until it loops back to the first batch.
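The behaviour described above can be imitated in plain Python, without Keras. This is only a sketch: `batch_generator` and `run_epochs` are hypothetical names, and `run_epochs` merely mimics a training loop pulling steps_per_epoch batches per epoch from one generator that keeps its position.

```python
def batch_generator(samples, batch_size):
    """Yield batches forever, wrapping around at the end of the data."""
    while True:
        for start in range(0, len(samples), batch_size):
            yield samples[start:start + batch_size]


def run_epochs(generator, epochs, steps_per_epoch):
    """Pull steps_per_epoch batches per epoch from one shared generator.

    The generator is never recreated, so each epoch continues
    where the previous one stopped.
    """
    seen = []
    for _ in range(epochs):
        seen.append([next(generator) for _ in range(steps_per_epoch)])
    return seen


samples = list(range(6))  # illustrative data: three batches of two
gen = batch_generator(samples, batch_size=2)
history = run_epochs(gen, epochs=4, steps_per_epoch=1)

for epoch, batches in enumerate(history, start=1):
    print("epoch", epoch, "saw", batches)
```

With three batches available and one step per epoch, the fourth epoch sees the first batch again.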
I prefer training on the entire data each epoch, and if that takes too long, I use a callback that shows the logs at the end of each batch:
from keras.callbacks import LambdaCallback
# print the metrics dictionary at the end of every batch
callbacks = [LambdaCallback(on_batch_end=lambda batch, logs: print(logs))]
I was never able to use use_multiprocessing=True; it freezes at the start of the first epoch.
I've noticed that workers is related to how many batches are preloaded from the generator. If you define max_queue_size=1, you will have exactly workers batches preloaded.
They suggest you use a Keras Sequence when multiprocessing. A Sequence works pretty much like a generator, but it keeps track of the order/position of each batch.
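To illustrate what "keeping track of the position of each batch" means, here is a rough sketch of the interface a Sequence exposes: a length (number of batches) and index-based access to each batch. Assumption: in real use the class would subclass keras.utils.Sequence; a plain class named `BatchSequence` (a made-up name) is used here so the sketch runs without Keras installed.

```python
import math


class BatchSequence:
    """Index-addressable batches, following the Sequence interface.

    Assumption: with Keras available this would subclass
    keras.utils.Sequence instead of being a plain class.
    """

    def __init__(self, x, y, batch_size):
        self.x = x
        self.y = y
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch, counting a smaller last batch.
        return math.ceil(len(self.x) / self.batch_size)

    def __getitem__(self, idx):
        # Batch idx is always the same slice, regardless of call order.
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        start = idx * self.batch_size
        end = start + self.batch_size
        return self.x[start:end], self.y[start:end]


x = list(range(10))          # illustrative inputs
y = [v * 2 for v in x]       # illustrative targets
seq = BatchSequence(x, y, batch_size=4)

print(len(seq))   # number of batches
print(seq[2])     # the last, smaller batch
```

Because every batch is fetched by index rather than by calling next(), several workers can request different batches without duplicating or skipping data.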