I am trying to code and learn different neural network models, and I am having a lot of trouble with input dimensionality. I am looking for a tutorial that explains the differences between the various input shapes.
The Keras documentation shows you all the `input_shape`s expected by each layer.
In Keras, you'll see input shapes in these forms:
Input shapes defined by the user and shapes passed to `Reshape` layers:
The defined input shape ignores the batch size; it requires only the shape of an individual sample of data.
For instance, when you define a `Dense` layer, you state its `input_shape` as `(10,)`, meaning it expects ten numeric values as input.
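A minimal sketch of that definition, assuming the `tf.keras` API (the 5 units are an arbitrary choice):

```python
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

# Each sample is a plain vector of 10 values; the batch size is
# deliberately NOT part of input_shape.
model = Sequential([Dense(5, input_shape=(10,))])
```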
Shapes shown by Keras in general:
These will have `None` as the first dimension. This stands for the size of the batch you use for training. A batch is an array with many data samples. Since the batch size is free and only defined when you actually pass a batch for training, Keras shows it as `None`.
In that `Dense` layer, Keras would show `(None, 10)` as the input shape.
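You can check this yourself; a small sketch assuming `tf.keras`, where the model's `input_shape` and `output_shape` properties report the free batch dimension as `None`:

```python
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

model = Sequential([Dense(5, input_shape=(10,))])
print(model.input_shape)   # (None, 10) -- None is the free batch dimension
print(model.output_shape)  # (None, 5)
```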
Array shapes:
During training, predicting, etc., when you actually have a batch and thus know its size, Keras will show in error messages the actual shape of the batch, which is `(BatchSize, ...other dimensions...)`.
For our `Dense(10)` layer, suppose you passed a batch with 300 samples for training; Keras would then show error messages containing the shape `(300, 10)`.
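A sketch of this situation, assuming `tf.keras` and random data just to make it runnable:

```python
import numpy as np
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

model = Sequential([Dense(10, input_shape=(10,))])
model.compile(optimizer='adam', loss='mse')

# A batch of 300 samples: the arrays actually passed have shape (300, 10)
x = np.random.random((300, 10))
y = np.random.random((300, 10))
model.fit(x, y, epochs=1, verbose=0)
```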
Tensor shapes:
Tensor shapes appear in more advanced usage: when you create `Lambda` layers or custom layers, and when you write custom loss functions.
Tensor shapes follow the same idea of having the batch size as the first dimension. So remember that whenever you work directly with tensors, they will have shape `(BatchSize, ...other dimensions...)`.
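A small sketch of both cases, assuming `tf.keras`; `mean_abs_error` is just an illustrative custom loss:

```python
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Input, Lambda

inp = Input(shape=(10,))          # user-defined shape: (10,)
print(K.int_shape(inp))           # (None, 10) -- batch size comes first

# Inside a Lambda layer, the tensor t also carries the batch dimension:
halved = Lambda(lambda t: t / 2)(inp)   # t has shape (None, 10)

# A custom loss likewise receives tensors shaped (BatchSize, ...):
def mean_abs_error(y_true, y_pred):
    return K.mean(K.abs(y_true - y_pred), axis=-1)
```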
Now that you know those differences, I'll keep using the form with `None` below. Remember to ignore the `None` part when defining `input_shape`.
Dense layers:
A `Dense` layer usually takes single values, not arrays, as inputs, so its input shape is `(None, AnyNumberOfInputValues)`. The output is similar: `(None, NumberOfNeuronsInThisLayer)`.
Conv1D layers:
A convolutional layer with only one dimension. When using convolutions, the idea is to have channels. Imagine a sound file: it contains two channels, left and right, and each channel is an array of values corresponding to the waveform.
Keras offers you the option of having `channels_first` or `channels_last`, which changes the input and output shapes of the layers:
- `(None, channels, length)` with `channels_first`
- `(None, length, channels)` with `channels_last`
The output also follows the `channels_first`/`channels_last` setting: `(None, NumberOfFilters, ResultingLength)` with `channels_first`. The default setting is `channels_last`, and you can set it in each layer or in the `keras.json` file.
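A sketch with the default `channels_last`, assuming `tf.keras` (the 1000-step, 2-channel input and the 16 filters are arbitrary choices):

```python
from tensorflow.keras.layers import Conv1D, Input
from tensorflow.keras.models import Model

# channels_last (the default): input is (None, length, channels)
inp = Input(shape=(1000, 2))       # e.g. 1000 time steps, 2 audio channels
out = Conv1D(filters=16, kernel_size=5)(inp)
model = Model(inp, out)
print(model.output_shape)          # (None, 996, 16): with 'valid' padding,
                                   # ResultingLength = 1000 - 5 + 1 = 996
```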
Conv2D layers:
The same idea as `Conv1D`, using channels, either with `channels_first` or `channels_last`. But now there are two dimensions, as in images, with each channel corresponding to a color: red, green, or blue.
- `(None, channels, pixelsX, pixelsY)` with `channels_first`
- `(None, pixelsX, pixelsY, channels)` with `channels_last`
The output, following the `channels_first`/`channels_last` setting, is `(None, NumberOfFilters, resultPixelsX, resultPixelsY)` for `channels_first`.
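A sketch with the default `channels_last`, assuming `tf.keras` (the 64x64 RGB input and 32 filters are arbitrary):

```python
from tensorflow.keras.layers import Conv2D, Input
from tensorflow.keras.models import Model

# channels_last (the default): input is (None, pixelsX, pixelsY, channels)
inp = Input(shape=(64, 64, 3))                   # a 64x64 RGB image
out = Conv2D(filters=32, kernel_size=(3, 3))(inp)
model = Model(inp, out)
print(model.output_shape)                        # (None, 62, 62, 32)
```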
LSTM layers:
LSTM layers are harder to understand. They have options that use different input shapes, and there are some interesting tutorials about that, such as this one and this too.
Normally, input shapes are `(None, TimeSteps, InputDimension)`. Since LSTM is for sequences, you separate each sequence into time steps; the input dimension simply allows for multidimensional values at each step.
The output also varies depending on the chosen options. It may be `(None, TimeSteps, NumberOfCellsInTheLayer)` or just `(None, NumberOfCells)`, depending on whether you choose to return sequences or not.
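A sketch of both options, assuming `tf.keras` (20 time steps, 7 input dimensions, and 32 cells are arbitrary choices):

```python
from tensorflow.keras.layers import LSTM, Input
from tensorflow.keras.models import Model

inp = Input(shape=(20, 7))    # 20 time steps, 7 input dimensions per step

# return_sequences=True: one output per time step
seq = LSTM(32, return_sequences=True)(inp)
print(Model(inp, seq).output_shape)    # (None, 20, 32)

# return_sequences=False (the default): only the last step's output
last = LSTM(32)(inp)
print(Model(inp, last).output_shape)   # (None, 32)
```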