Question
I am doing data augmentation using

from tensorflow.keras.preprocessing import image

data_gen = image.ImageDataGenerator(rotation_range=20, width_shift_range=0.2,
                                    height_shift_range=0.2, zoom_range=0.15,
                                    horizontal_flip=False)
data_iter = data_gen.flow(X_train, Y_train, batch_size=64)
data_gen.flow() needs a rank-4 data matrix, so the shape of X_train is (60000, 28, 28, 1). We need to pass the same shape, i.e. (60000, 28, 28, 1), while defining the architecture of the model as follows:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten

model = Sequential()
model.add(Dense(units=64, activation='relu', kernel_initializer='he_normal',
                input_shape=(28, 28, 1)))
model.add(Flatten())
model.add(Dense(units=10, activation='relu', kernel_initializer='he_normal'))
model.summary()
model.add(Flatten()) was used to bring the data back down to rank 2. Now the problem is with model.summary(): it is giving incorrect output, as shown below:
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 28, 28, 64) 128
_________________________________________________________________
flatten_1 (Flatten) (None, 50176) 0
_________________________________________________________________
dense_2 (Dense) (None, 10) 501770
=================================================================
Total params: 501,898
Trainable params: 501,898
Non-trainable params: 0
The Output Shape for dense_1 (Dense) should be (None, 64) and the Param # should be (28*28*64)+64, i.e. 50240. The Output Shape for dense_2 (Dense) is correct, but the Param # should be (64*10)+10, i.e. 650.
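For reference, these expected counts follow the usual fully-connected formula, params = input_dim * units + units (weights plus biases):

# Expected parameter counts from the fully-connected formula
# params = input_dim * units + units (weights plus biases)
print(28 * 28 * 64 + 64)  # 50240, expected for dense_1
print(64 * 10 + 10)       # 650, expected for dense_2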
Why is this happening and how can this problem be addressed?
Answer 1:
The summary is not incorrect. The Keras Dense layer always works on the last dimension of its input.
ref: https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense
Input shape:
N-D tensor with shape: (batch_size, ..., input_dim). The most common situation would be a 2D input with shape (batch_size, input_dim).
Output shape:
N-D tensor with shape: (batch_size, ..., units). For instance, for a 2D input with shape (batch_size, input_dim), the output would have shape (batch_size, units).
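This is why the summary reports 128 parameters for dense_1: the kernel only connects the last axis (size 1) to the 64 units. A minimal check, assuming tf.keras:

import tensorflow as tf

# Dense acts on the last axis only: with input (..., 1) and 64 units,
# the kernel is (1, 64) plus a bias of 64 -> (1 * 64) + 64 = 128 params.
layer = tf.keras.layers.Dense(units=64)
layer.build(input_shape=(None, 28, 28, 1))
print(layer.kernel.shape)    # (1, 64)
print(layer.count_params())  # 128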
Before each Dense layer, you need to manually apply Flatten() to make sure you're passing 2-D data, as in the sketch below.
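For example, a minimal sketch of that fix, assuming tf.keras: move Flatten() in front of the first Dense layer, so dense receives 784 inputs and the summary shows (None, 64) with 50240 params.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten

# Flatten the (28, 28, 1) input before the first Dense layer.
model = Sequential()
model.add(Flatten(input_shape=(28, 28, 1)))
model.add(Dense(units=64, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(units=10, activation='relu', kernel_initializer='he_normal'))
model.summary()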
One work-around for your desired output_shape is:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten

model = Sequential()
model.add(Dense(units=1, activation='linear', use_bias=False, trainable=False,
                kernel_initializer=tf.keras.initializers.Ones(),
                input_shape=(28, 28, 1)))
model.add(Flatten())
model.add(Dense(units=64, activation='relu'))
model.add(Dense(units=10, activation='relu', kernel_initializer='he_normal'))
model.summary()
The first layer is a single unit whose kernel is initialized to ones, with no bias, so it just multiplies the input by one and passes it on to the next layer to be flattened. This removes unnecessary parameters from the model.
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 28, 28, 1) 2
_________________________________________________________________
flatten (Flatten) (None, 784) 0
_________________________________________________________________
dense_1 (Dense) (None, 64) 50240
_________________________________________________________________
dense_2 (Dense) (None, 10) 650
=================================================================
Total params: 50,892
Trainable params: 50,892
Non-trainable params: 0
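As a quick sanity check (a sketch using hypothetical random data), the frozen ones-kernel layer really is an identity map:

import numpy as np
import tensorflow as tf

# With a (1, 1) kernel of ones and no bias, the first layer's output
# equals its input, so only the Flatten/Dense stack after it learns.
x = np.random.rand(2, 28, 28, 1).astype('float32')
identity = tf.keras.layers.Dense(units=1, activation='linear', use_bias=False,
                                 trainable=False,
                                 kernel_initializer=tf.keras.initializers.Ones())
print(np.allclose(x, identity(x)))  # True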
Source: https://stackoverflow.com/questions/61554123/keras-model-summary-incorrect