Question
I'm trying to run this convolutional autoencoder sample but with my own data, so I modified its InputLayer according to my images. However, on the output layer there is a problem with the dimensions. I'm sure the problem is with UpSampling, but I'm not sure why it is happening. Here is the code:
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model

N, H, W = X_train.shape
input_img = Input(shape=(H, W, 1)) # adapt this if using `channels_first` image data format
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
# at this point the representation is (4, 4, 8) i.e. 128-dimensional
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
autoencoder.summary()
Then, when I run fit, it throws this error:
i += 1
autoencoder.fit(x_train, x_train,
                epochs=50,
                batch_size=128,
                shuffle=True,
                validation_data=(x_test, x_test),
                callbacks=[TensorBoard(log_dir='/tmp/autoencoder/{}'.format(i))])
ValueError: Error when checking target: expected conv2d_23 to have shape (148, 84, 1) but got array with shape (150, 81, 1)
I went back to the tutorial code and tried looking at its model's summary, which shows the following:
I'm sure there is a problem while reconstructing the output in the decoder, but I'm not sure why: why does it work for 28x28 images but not for my 150x81 ones?
I guess I could solve this by changing my images' dimensions a little, but I'd like to understand what is happening and how I can avoid it.
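The mismatch can be reproduced without Keras by tracing the spatial dimensions by hand. A sketch (`trace_shapes` is a made-up helper; it assumes the layer stack above: three ceil-halving poolings with padding='same', three doublings from UpSampling2D, and one 3x3 convolution without padding that shrinks each dim by 2):

```python
import math

def trace_shapes(h, w):
    """Follow (h, w) through the model above: three MaxPooling2D((2, 2),
    padding='same') layers (each ceil-halves), then the decoder's three
    UpSampling2D((2, 2)) layers (each doubles), with the 16-filter Conv2D
    in between using no padding, so it shrinks each dim by 2."""
    for _ in range(3):
        h, w = math.ceil(h / 2), math.ceil(w / 2)  # encoder poolings
    h, w = 2 * h, 2 * w                            # first UpSampling2D
    h, w = 2 * h, 2 * w                            # second UpSampling2D
    h, w = h - 2, w - 2                            # Conv2D(16, (3, 3)) without padding='same'
    h, w = 2 * h, 2 * w                            # third UpSampling2D
    return h, w

print(trace_shapes(28, 28))   # (28, 28): the tutorial's dims survive the round trip
print(trace_shapes(150, 81))  # (148, 84): exactly the shape in the error message
```

Two things go wrong at once for 150x81: the valid convolution's -2 is only undone for sizes the tutorial was tuned for, and odd dimensions get rounded up by pooling but doubled exactly by upsampling.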
Answer 1:
You can use ZeroPadding2D to pad the input image to 32x32, then use Cropping2D to crop the decoded image back to 28x28.
from keras.layers import ZeroPadding2D, Cropping2D
input_img = Input(shape=(28,28,1)) # adapt this if using `channels_first` image data format
input_img_padding = ZeroPadding2D((2,2))(input_img) #zero padding image to shape 32X32
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img_padding)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
# at this point the representation is (4, 4, 8) i.e. 128-dimensional
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
decoded_cropping = Cropping2D((2,2))(decoded) # crop image from 32X32 back to 28X28
autoencoder = Model(input_img, decoded_cropping)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
autoencoder.summary()
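The same trick applies to the 150x81 images from the question: pad each dimension up to the next multiple of 8 (the encoder halves the spatial dims three times), then crop back. A small sketch to compute the amounts (`pad_amount` is a hypothetical helper, not part of Keras):

```python
def pad_amount(dim, multiple=8):
    """Pixels to add so `dim` reaches the next multiple of `multiple`
    (8 here, because the encoder halves each spatial dim three times)."""
    return (-dim) % multiple

# 150x81 needs 2 extra rows and 7 extra columns, e.g.
# ZeroPadding2D(((1, 1), (3, 4))) on the way in and
# Cropping2D(((1, 1), (3, 4))) on the way out.
print(pad_amount(150))  # 2  -> padded height 152
print(pad_amount(81))   # 7  -> padded width 88
```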
Answer 2:
The 16-filter Conv2D in your decoder does not use any padding. You can fix this by changing that layer to:
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
The asymmetric shrink from the valid convolution disappears; the output dims then match the input dims whenever the input height and width are divisible by 8 (the encoder pools three times). For 150x81 they are not, so you still need the padding/cropping approach from Answer 1.
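One caveat worth checking by hand (`round_trip_same` is a made-up helper): with padding='same' on every Conv2D, each spatial dim is ceil-halved three times and then doubled three times, so it always round-trips to a multiple of 8:

```python
import math

def round_trip_same(dim):
    """One spatial dim through the model when every Conv2D uses
    padding='same': three ceil-halvings, then three doublings."""
    for _ in range(3):
        dim = math.ceil(dim / 2)
    return dim * 8

print(round_trip_same(150))  # 152, not 150
print(round_trip_same(81))   # 88, not 81
print(round_trip_same(28))   # 32 -- the tutorial relies on the valid conv to reach 28
```

So for 150x81 this change alone still leaves a mismatch; combining it with the ZeroPadding2D/Cropping2D approach from Answer 1 (pad to 152x88, crop back afterwards) makes the shapes line up exactly.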
Source: https://stackoverflow.com/questions/50515409/keras-shapes-while-upsampling-mismatch