Passing output of a CNN to BILSTM

Question

I am working on a project in which I have to pass the output of a CNN to a bidirectional LSTM. I created the model below, but it throws an 'incompatible' error. Please let me know where I am going wrong and how to fix this.


    model = Sequential()
    model.add(Conv2D(filters = 16, kernel_size = 3,input_shape = (32,32,1)))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2,2),strides=1, padding='valid'))
    model.add(Activation('relu'))
    
    model.add(Conv2D(filters = 32, kernel_size=3))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2,2)))
    model.add(Activation('relu'))
    
    model.add(Dropout(0.25))
    model.add(Conv2D(filters = 48, kernel_size=3))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2,2)))
    model.add(Activation('relu'))
    
    model.add(Dropout(0.25))
    model.add(Conv2D(filters = 64, kernel_size=3))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    
    model.add(Dropout(0.25))
    model.add(Conv2D(filters = 80, kernel_size=3))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    
    model.add(Bidirectional(LSTM(150, return_sequences=True)))
    model.add(Dropout(0.3))
    model.add(Bidirectional(LSTM(96)))
    model.add(Dense(total_words/2, activation='relu', kernel_regularizer=regularizers.l2(0.01)))
    model.add(Dense(total_words, activation='softmax'))
    
    model.summary()

The error returned is:


    ValueError                                Traceback (most recent call last)
    <ipython-input-24-261befed7006> in <module>()
         27 model.add(Activation('relu'))
         28 
    ---> 29 model.add(Bidirectional(LSTM(150, return_sequences=True)))
         30 model.add(Dropout(0.3))
         31 model.add(Bidirectional(LSTM(96)))
    
    5 frames
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/input_spec.py in assert_input_compatibility(input_spec, inputs, layer_name)
        178                          'expected ndim=' + str(spec.ndim) + ', found ndim=' +
        179                          str(ndim) + '. Full shape received: ' +
    --> 180                          str(x.shape.as_list()))
        181     if spec.max_ndim is not None:
        182       ndim = x.shape.ndims
    
    ValueError: Input 0 of layer bidirectional is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 1, 1, 80]

Answer 1:

The problem is the shape of the data passed to the LSTM, and it can be solved inside your network. The LSTM expects 3D input (batch_size, timesteps, features), while the CNN outputs 4D data (batch_size, H, W, channels). There are two options you can adopt: 1) reshape to (batch_size, H, W*channels); 2) reshape to (batch_size, W, H*channels). Either way, you have 3D data to use inside your LSTM. An example is below:

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import (
        Activation, BatchNormalization, Bidirectional, Conv2D, Dense, Dropout,
        Lambda, LSTM, MaxPooling2D, Permute, Reshape)

    def ReshapeLayer(x):
        # x is the 4D CNN output: (batch, H, W, channels)
        shape = x.shape

        # Option 1: rows as timesteps -> (batch, H, W*channels)
        reshape = Reshape((shape[1], shape[2] * shape[3]))(x)

        # Option 2: columns as timesteps -> (batch, W, H*channels)
        # transpose = Permute((2, 1, 3))(x)
        # reshape = Reshape((shape[2], shape[1] * shape[3]))(transpose)

        return reshape

    total_words = 300  # example value
    model = Sequential()
    model.add(Conv2D(filters=16, kernel_size=3, input_shape=(32, 32, 1)))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2), strides=1, padding='valid'))
    model.add(Activation('relu'))

    model.add(Conv2D(filters=32, kernel_size=3))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Activation('relu'))

    model.add(Dropout(0.25))
    model.add(Conv2D(filters=48, kernel_size=3))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Activation('relu'))

    model.add(Dropout(0.25))
    model.add(Conv2D(filters=64, kernel_size=3))
    model.add(BatchNormalization())
    model.add(Activation('relu'))

    model.add(Dropout(0.25))
    model.add(Conv2D(filters=80, kernel_size=3))
    model.add(BatchNormalization())
    model.add(Activation('relu'))

    model.add(Lambda(ReshapeLayer))  # <============ 4D -> 3D

    model.add(Bidirectional(LSTM(150, return_sequences=True)))
    model.add(Dropout(0.3))
    model.add(Bidirectional(LSTM(96)))
    # Integer division keeps the unit count an int
    model.add(Dense(total_words // 2, activation='relu'))
    model.add(Dense(total_words, activation='softmax'))

    model.summary()
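To see where the `[None, 1, 1, 80]` in the error message comes from, and what sequence length the reshape ends up producing, the spatial sizes can be traced by hand. This is a minimal sketch in plain Python, assuming the usual output-size formula for unpadded ('valid') convolution and pooling windows; the helper name `out_size` is made up for illustration and is not part of Keras.

```python
def out_size(size, kernel, stride=1):
    """Output length of an unpadded ('valid') conv or pooling window."""
    return (size - kernel) // stride + 1

# (name, kernel, stride) for each spatial layer of the model above, in order.
layers = [
    ("conv 16", 3, 1),
    ("pool (strides=1)", 2, 1),
    ("conv 32", 3, 1),
    ("pool", 2, 2),
    ("conv 48", 3, 1),
    ("pool", 2, 2),
    ("conv 64", 3, 1),
    ("conv 80", 3, 1),
]

size = 32          # input images are 32 x 32
trace = [size]
for name, kernel, stride in layers:
    size = out_size(size, kernel, stride)
    trace.append(size)
    print(f"{name:18s} -> {size} x {size}")

channels = 80
# Option 1 reshape: (H, W * channels), batch axis omitted
seq_shape = (size, size * channels)
print("sequence shape:", seq_shape)
```

With a 32x32 input the feature map is already 1x1 by the last convolution, so after the reshape the BiLSTM sees a single timestep of 80 features. If a longer sequence is wanted, one option (a suggestion, not part of the answer above) is to drop some of the pooling layers or use `padding='same'` in the convolutions.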

Answer 2:

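Conv2D produces a 2-dimensional feature map per sample, but an LSTM consumes a 1-dimensional sequence of feature vectors. This is why it expects 3 dimensions (Batch, Sequence, Hid) but finds 4 (Batch, X, Y, Hid). The solution is to collapse the two spatial axes into a single sequence axis between the CNN and the LSTM. Note that a plain Flatten layer merges everything except the batch axis and yields 2D output (Batch, X*Y*Hid), so on its own it does not satisfy the LSTM; a Reshape layer, or Flatten followed by a Reshape, restores the sequence axis.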
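The difference between flattening everything and keeping a sequence axis can be checked with plain NumPy. This is a minimal sketch with made-up sizes (a 4x5 feature map with 80 channels); it mimics what Keras's Flatten does, merging all non-batch axes and leaving rank 2, against the rank-3 layout an LSTM needs.

```python
import numpy as np

# Made-up sizes for illustration: (batch, X, Y, Hid)
batch, height, width, channels = 2, 4, 5, 80
features = np.zeros((batch, height, width, channels))

# Flatten-style: everything but the batch axis is merged -> rank 2.
flat = features.reshape(batch, -1)

# LSTM-style: (batch, timesteps, features) -> rank 3.
# Here every spatial position becomes one timestep.
seq = features.reshape(batch, height * width, channels)

print(flat.shape)  # (2, 1600)
print(seq.shape)   # (2, 20, 80)
```

In a Keras model, the second layout corresponds to a `Reshape((-1, channels))` layer placed after the last convolution.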

Source: https://stackoverflow.com/questions/63789810/passing-output-of-a-cnn-to-bilstm

Tags

python

tensorflow

keras

lstm

CNN