Question
I am using a dropout layer in my model. As I use temporal data, I want the noise_shape to be the same per timestep -> (batch_size, 1, features).
The problem is that if the number of samples is not divisible by the batch size, I get an error message. Example: batch_size=2, samples=7. In the last iteration, the batch_size (2) is larger than the number of remaining samples (1).
The other layers (in my case: Masking, Dense, and LSTM) apparently don't have a problem with that and just use a smaller batch for the last, leftover samples.
Concrete error: the training data shape is [23, 300, 34] and batchsize=3.
InvalidArgumentError (see above for traceback): Incompatible shapes: [2,300,34] vs. [3,1,34] [[Node: dropout_18/cond/dropout/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](dropout_18/cond/dropout/div, dropout_18/cond/dropout/Floor)]]
This means that the last batch has shape [2,300,34], which is incompatible with the fixed noise shape [3,1,34].
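The arithmetic behind that leftover batch can be sketched in plain Python. This is only an illustration: `batch_sizes` is a hypothetical helper, not a Keras function, and it assumes the final batch is simply allowed to be smaller, as the other layers appear to handle it.

```python
def batch_sizes(num_samples, batch_size):
    """Return the size of each batch when the final batch may be smaller."""
    full, remainder = divmod(num_samples, batch_size)
    sizes = [batch_size] * full
    if remainder:
        # Leftover samples form one final, smaller batch.
        sizes.append(remainder)
    return sizes


# 23 training samples with batchsize=3, as in the error above.
sizes = batch_sizes(23, 3)
print(sizes)
```

With 23 samples and a batch size of 3, the final entry is 2, which matches the [2,300,34] shape in the error message.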
As I am still in the parameter-tuning phase (does that ever stop? :-) ), the following will keep changing:
- lookback (using LSTMs),
- split percentage of train/val/test,
- and batch size.
All of these influence the actual length and shape of the training data.
I could try to always find the next fitting integer for batch_size by some calculation. For example, if batch_size=4 and samples=21, I could reduce batch_size to 3. But if the number of training samples is, e.g., a prime, this would not work either. Also, if I choose 4, I would probably like to have 4.
Am I overthinking this? Is there a simple solution without a lot of exception handling?
Thank you
Answer 1:
Thanks to nuric in this post, the answer is quite simple.
The current implementation does adjust the noise shape according to the runtime batch size. From the Dropout layer implementation code:
symbolic_shape = K.shape(inputs)
noise_shape = [symbolic_shape[axis] if shape is None else shape
               for axis, shape in enumerate(self.noise_shape)]
So if you give
noise_shape=(None, 1, features)
the shape will be (runtime_batchsize, 1, features) following the code above.
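To see how the `None` entry gets filled in, here is a minimal plain-Python sketch that mirrors the list comprehension quoted above. It is an illustration under assumptions: `resolve_noise_shape` is a hypothetical helper (not part of Keras), ordinary tuples stand in for the symbolic tensor shape returned by `K.shape(inputs)`, and the shapes are taken from the error message in the question.

```python
def resolve_noise_shape(noise_shape, input_shape):
    """Replace each None entry in noise_shape with the matching input dimension."""
    return tuple(
        input_shape[axis] if dim is None else dim
        for axis, dim in enumerate(noise_shape)
    )


features = 34
noise_shape = (None, 1, features)

# A full batch of 3 samples and the leftover batch of 2 samples.
full_batch = resolve_noise_shape(noise_shape, (3, 300, features))
last_batch = resolve_noise_shape(noise_shape, (2, 300, features))
print(full_batch)
print(last_batch)
```

In the model itself this corresponds to building the layer as `Dropout(rate, noise_shape=(None, 1, features))`, with `rate` being whatever dropout rate you already use, so the mask follows the size of each batch, including the smaller last one.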
Source: https://stackoverflow.com/questions/50858265/using-noise-shape-of-the-dropout-layer-batch-size-does-not-fit-into-provided-sa