I want to pass the output of ConvLSTM and Conv2D to a Dense layer in Keras. What is the difference between using global average pooling and flatten? Both work in my case.
You can check the difference between Flatten and global pooling yourself by comparing them against NumPy, if you are more comfortable with it.
We demonstrate this using, as input, a batch of images with shape (batch_dim, height, width, n_channels)
import numpy as np
from tensorflow.keras.layers import Flatten, GlobalAveragePooling2D
batch_dim, H, W, n_channels = 32, 5, 5, 3
X = np.random.uniform(0,1, (batch_dim,H,W,n_channels)).astype('float32')
Flatten is typically applied to input tensors of at least 3D. It reshapes the input to 2D with the format (batch_dim, all the rest). In our 4D case, the result has shape (batch_dim, H*W*n_channels).
np_flatten = X.reshape(batch_dim, -1) # (batch_dim, H*W*n_channels)
tf_flatten = Flatten()(X).numpy() # (batch_dim, H*W*n_channels)
(tf_flatten == np_flatten).all() # True
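To make the reshape concrete, here is a minimal NumPy-only sketch of where each element ends up after flattening. It assumes the default row-major (C-order) layout that `reshape` uses; the small batch size and the chosen indices are only illustrative.

```python
import numpy as np

# Small illustrative batch; values are 0, 1, 2, ... so positions are easy to trace
batch_dim, H, W, n_channels = 2, 5, 5, 3
X = np.arange(batch_dim * H * W * n_channels, dtype="float32").reshape(
    batch_dim, H, W, n_channels
)

# Same operation as in the comparison above: keep the batch axis, merge the rest
flat = X.reshape(batch_dim, -1)  # (batch_dim, H*W*n_channels)

# In row-major order, element (i, j, c) of an image lands at this flat position
b, i, j, c = 1, 2, 3, 1
flat_index = (i * W + j) * n_channels + c
same_value = bool(flat[b, flat_index] == X[b, i, j, c])

print(flat.shape, same_value)
```

Every spatial position keeps its own slot in the flattened vector, so a following Dense layer sees one input feature per (height, width, channel) location.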
GlobalAveragePooling2D accepts a 4D tensor as input. It takes the mean over the height and width dimensions for each channel. The result is 2D, with shape (batch_dim, n_channels). GlobalMaxPooling2D does the same but with the max operation.
np_GlobalAvgPool2D = X.mean(axis=(1,2)) # (batch_dim, n_channels)
tf_GlobalAvgPool2D = GlobalAveragePooling2D()(X).numpy() # (batch_dim, n_channels)
np.allclose(tf_GlobalAvgPool2D, np_GlobalAvgPool2D) # True (allclose tolerates float32 rounding differences)
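As a rough sketch of what this means for the Dense layer that follows, the snippet below computes the NumPy equivalent of the max variant and compares how many inputs and weights a Dense layer would get in each case. The Dense layer with 10 units is a hypothetical choice for illustration, and the parameter count assumes the usual one weight per input plus one bias per unit.

```python
import numpy as np

batch_dim, H, W, n_channels = 32, 5, 5, 3
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (batch_dim, H, W, n_channels)).astype("float32")

# NumPy counterpart of global max pooling: max over height and width per channel
np_global_max = X.max(axis=(1, 2))  # (batch_dim, n_channels)

# Hypothetical Dense layer size, chosen only for this comparison
units = 10

# Number of features the Dense layer receives after each operation
flatten_features = H * W * n_channels  # one feature per spatial location and channel
pooled_features = n_channels           # one feature per channel

# Dense parameters = (inputs + 1 bias) * units
flatten_params = (flatten_features + 1) * units
pooled_params = (pooled_features + 1) * units

print(np_global_max.shape, flatten_params, pooled_params)
```

With these shapes, Flatten feeds 75 features into the Dense layer while global pooling feeds 3, so the pooled version has far fewer weights but discards where in the image each activation occurred.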