Using noise_shape of the Dropout layer. Batch_size does not fit into provided samples. What to do?

Question

I am using a Dropout layer in my model. As I use temporal data, I want the dropout mask to be the same for every timestep, i.e. noise_shape = (batch_size, 1, features).

The problem is that if the batch size does not evenly divide the number of provided samples, I get an error message. Example: batch_size=2, samples=7. In the last iteration, the batch_size (2) is larger than the number of remaining samples (1).

The other layers (in my case: Masking, Dense, and LSTM) apparently don't have a problem with that and simply use a smaller batch for the last, leftover samples.

Concrete error: the training data shape is [23, 300, 34] and batch_size=3.

InvalidArgumentError (see above for traceback): Incompatible shapes: [2,300,34] vs. [3,1,34] [[Node: dropout_18/cond/dropout/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](dropout_18/cond/dropout/div, dropout_18/cond/dropout/Floor)]]

This means that the last batch, of shape [2, 300, 34], is incompatible with the fixed noise shape [3, 1, 34].
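To make the mismatch concrete, here is a minimal pure-Python sketch (no TensorFlow needed) that computes the size of the final batch and checks it against a fixed noise shape using the usual broadcasting rule. The helper names `last_batch_size` and `broadcastable` are made up for illustration and are not part of Keras.

```python
def last_batch_size(n_samples, batch_size):
    """Size of the final batch when samples are consumed in order."""
    remainder = n_samples % batch_size
    return remainder if remainder else batch_size


def broadcastable(shape_a, shape_b):
    """Broadcasting rule: aligned from the right, dims must match or be 1."""
    for a, b in zip(reversed(shape_a), reversed(shape_b)):
        if a != b and a != 1 and b != 1:
            return False
    return True


# Numbers taken from the error message above.
n_samples, timesteps, features, batch_size = 23, 300, 34, 3
fixed_noise_shape = (batch_size, 1, features)

last = last_batch_size(n_samples, batch_size)
full_batch = (batch_size, timesteps, features)
final_batch = (last, timesteps, features)

print(last)                                           # 2
print(broadcastable(full_batch, fixed_noise_shape))   # True
print(broadcastable(final_batch, fixed_noise_shape))  # False
```

Every full batch lines up with the mask, and only the leftover batch of 2 samples fails, which matches the shapes reported in the traceback.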

As I am still in the parameter-tuning phase (does that ever stop? :-) ), the following will keep changing:

lookback (using LSTMs),
split percentage of train/val/test,
and batch size.

All of these influence the actual length and shape of the training data.

I could try to always compute the nearest batch_size that divides the number of samples. For example, if batch_size=4 and samples=21, I could reduce batch_size to 3. But if the number of training samples is, e.g., prime, this would not work either. Also, if I choose 4, I would probably like to actually get 4.
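A rough sketch of that workaround, only to show why it is unattractive. The function name `largest_divisor_at_most` is hypothetical.

```python
def largest_divisor_at_most(n_samples, wanted):
    """Largest batch size <= wanted that divides n_samples evenly."""
    if n_samples < 1 or wanted < 1:
        raise ValueError("n_samples and wanted must be positive")
    for candidate in range(min(wanted, n_samples), 0, -1):
        if n_samples % candidate == 0:
            return candidate


print(largest_divisor_at_most(21, 4))  # 3
print(largest_divisor_at_most(23, 3))  # 1, because 23 is prime
```

With a prime sample count the only divisor that fits is 1, so the requested batch size is effectively ignored.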

Am I overthinking this? Is there a simple solution that does not require a lot of special-case handling?

Thank you

Answer 1:

Thanks to nuric in this post, the answer is quite simple.

The current implementation does adjust the noise shape according to the runtime batch size. From the Dropout layer implementation code:

    symbolic_shape = K.shape(inputs)
    noise_shape = [symbolic_shape[axis] if shape is None else shape
                   for axis, shape in enumerate(self.noise_shape)]

So if you give noise_shape=(None, 1, features) the shape will be (runtime_batchsize, 1, features) following the code above.
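The substitution rule can be mimicked in plain Python to see what happens for a full batch and for the smaller final batch. This is an illustrative sketch of the quoted list comprehension, not the Keras code itself. `resolve_noise_shape` is a made-up name, and plain tuples stand in for the symbolic tensor shape.

```python
def resolve_noise_shape(noise_shape, input_shape):
    """Replace each None in noise_shape with the matching runtime dimension."""
    return tuple(
        input_shape[axis] if dim is None else dim
        for axis, dim in enumerate(noise_shape)
    )


noise_shape = (None, 1, 34)  # (batch, timesteps, features) from the question

print(resolve_noise_shape(noise_shape, (3, 300, 34)))  # (3, 1, 34)
print(resolve_noise_shape(noise_shape, (2, 300, 34)))  # (2, 1, 34)
```

In the model itself this corresponds to passing noise_shape=(None, 1, features) to the Dropout layer, so the mask follows whatever batch size arrives at runtime while still being shared across all timesteps.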

Source: https://stackoverflow.com/questions/50858265/using-noise-shape-of-the-dropout-layer-batch-size-does-not-fit-into-provided-sa

Tags

python

tensorflow

keras

dropout