ValueError : Input 0 of layer lstm is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 18]

问题

I'm new with Keras and I'm trying to build a model for personal use/future learning. I've just started with python and I came up with this code (with help of videos and tutorials). I have a data of 16324 instances, each instance consists of 18 features and 1 dependent variable.

import pandas as pd
import os
import time
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, LSTM, BatchNormalization
from tensorflow.keras.callbacks import TensorBoard, ModelCheckpoint

EPOCHS = 10
BATCH_SIZE = 64
NAME = f"-TEST-{int(time.time())}"

df = pd.read_csv("EntryData.csv", names=['1SH5', '1SHA', '1SA5', '1SAA', '1WH5', '1WHA', '2SA5', '2SAA', '2SH5', '2SHA', '2WA5', '2WAA', '3R1', '3R2', '3R3', '3R4', '3R5', '3R6', 'Target'])

df_val = 14554 

validation_df = df[df.index > df_val]
df = df[df.index <= df_val]

train_x = df.drop(columns=['Target'])
train_y = df[['Target']]
validation_x = validation_df.drop(columns=['Target'])
validation_y = validation_df[['Target']]

model = Sequential()
model.add(LSTM(128, input_shape=(train_x.shape[1:]), return_sequences=True))
model.add(Dropout(0.2))
model.add(BatchNormalization())

model.add(LSTM(128, return_sequences=True))
model.add(Dropout(0.1))
model.add(BatchNormalization())

model.add(LSTM(128))
model.add(Dropout(0.2))
model.add(BatchNormalization())

model.add(Dense(32, activation='relu'))
model.add(Dropout(0.2))

model.add(Dense(2, activation='softmax'))

opt = tf.keras.optimizers.Adam(lr=0.001, decay=1e-6)

model.compile(loss='sparse_categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])

tensorboard = TensorBoard(log_dir=f'logs/{NAME}')

filepath = "RNN_Final-{epoch:02d}-{val_acc:.3f}"  
checkpoint = ModelCheckpoint("models/{}.model".format(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')) # saves only the best ones

history = model.fit(
    train_x, train_y,
    batch_size=BATCH_SIZE,
    epochs=EPOCHS,
    validation_data=(validation_x, validation_y),
    callbacks=[tensorboard, checkpoint],)

score = model.evaluate(validation_x, validation_y, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

model.save("models/{}".format(NAME))

In line

model.add(LSTM(128, input_shape=(train_x.shape[1:]), return_sequences=True))

is throwing an error:

ValueError: Input 0 of layer lstm is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 18]

I was searching for solution on this site and on google for few hours now and I was not able to find proper answer for this or I was not able to implement the solution for similar problem.

Thank you for any tips.

回答1:

An LSTM network expects three dimensional input of this format:

(n_samples, time_steps, features)

There are two main ways this can be a problem.

Your input is 2D
You have stacked (multiple) LSTM layers

1. Your input is 2D

You need to turn your input to 3D.

x = x.reshape(len(x), 1, x.shape[1])
# or
x = np.expand_dims(x, 1)

Then, specify the right input shape in the first layer:

LSTM(64, input_shape=(x.shape[1:]))

2. You have stacked LSTM layers

By default, LSTM layers will not return sequences, i.e., they will return 2D output. This means that the second LSTM layer will not have the 3D input it needs. To address this, you need to set the return_sequences=True:

tf.keras.layers.LSTM(8, return_sequences=True),
tf.keras.layers.LSTM(8)

Here's how to reproduce and solve the 2D input problem:

import tensorflow as tf
import numpy as np

x = np.random.rand(100, 10)
# x = np.expand_dims(x, 1) # uncomment to solve the problem
y = np.random.randint(0, 2, 100)

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(8),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

history = model.fit(x, y, validation_split=0.1)

Here's how to reproduce and solve the stacked LSTM layers problem:

import tensorflow as tf
import numpy as np

x = np.random.rand(100, 1, 10)
y = np.random.randint(0, 2, 100)

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(8), # use return_sequences=True to solve the problem
    tf.keras.layers.LSTM(8),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

history = model.fit(x, y, validation_split=0.1)

来源：https://stackoverflow.com/questions/58119320/valueerror-input-0-of-layer-lstm-is-incompatible-with-the-layer-expected-ndim

标签

python

tensorflow

keras

lstm