I am using an LSTM on time series data. I have features about the time series that are not time dependent. Imagine company stocks for the series and stuff like c
The first problem is that an LSTM(8) layer expects two initial states, h_0 and c_0, each of shape (None, 8). That's what the error message means by "cell.state_size is (8, 8)".
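To see the two states for yourself, here is a minimal sketch that asks the layer to return them. It assumes tensorflow.keras is installed; the input size of 5 features and the name probe are only illustrative.

```python
import numpy as np
from tensorflow.keras.layers import Input, LSTM
from tensorflow.keras.models import Model

# Variable-length series with 5 features per timestep (illustrative size)
series_input = Input(shape=(None, 5))

# return_state=True makes the layer return its output plus both final states
output, h, c = LSTM(8, return_state=True)(series_input)
probe = Model(series_input, [output, h, c])

# Batch of 3 series, 7 timesteps each
shapes = [a.shape for a in probe.predict(np.random.rand(3, 7, 5), verbose=0)]
print(shapes)  # output, h and c each have shape (3, 8)
```

Since the layer carries both h and c, initial_state has to supply one tensor for each, both of shape (batch_size, 8).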
If you only have one initial state, dense_2, maybe you can switch to GRU (which requires only h_0). Or you can transform your feature_input into two initial states.
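The GRU option could look roughly like this. It is a sketch, not your exact model: the layer sizes are taken from the LSTM case, tensorflow.keras is assumed, and the name gru_model is made up for illustration.

```python
import numpy as np
from tensorflow.keras.layers import Input, Dense, GRU
from tensorflow.keras.models import Model

# Static features, one vector of 5 values per sample (illustrative size)
feature_input = Input(shape=(5,))
dense_1 = Dense(4, activation="relu")(feature_input)
dense_2 = Dense(8, activation="relu")(dense_1)  # shape (batch_size, 8)

# Time series input: any number of timesteps, 5 features per step
series_input = Input(shape=(None, 5))

# A GRU has a single state, so one tensor is enough as initial state
gru = GRU(8)(series_input, initial_state=[dense_2])
out = Dense(1, activation="sigmoid")(gru)
gru_model = Model(inputs=[feature_input, series_input], outputs=out)

# Quick shape check on random data: 2 samples, 7 timesteps
preds = gru_model.predict(
    [np.random.rand(2, 5), np.random.rand(2, 7, 5)], verbose=0
)
```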
The second problem is that h_0 and c_0 are of shape (batch_size, 8), but your dense_2 is of shape (batch_size, timesteps, 8). You need to deal with the time dimension before using dense_2 as initial states.
So maybe you can change your input shape to (data.training_features.shape[1],), so the features carry no time dimension at all, or average over the timesteps with GlobalAveragePooling1D.
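If you keep the per-timestep feature input, the pooling route might look like the sketch below. The sizes (10 timesteps, 5 features) and the names pooled and pooled_model are assumptions for illustration, and tensorflow.keras is assumed as the backend.

```python
import numpy as np
from tensorflow.keras.layers import Input, Dense, LSTM, GlobalAveragePooling1D
from tensorflow.keras.models import Model

timesteps, n_features = 10, 5  # hypothetical sizes

# Features repeated at every timestep: (batch_size, timesteps, n_features)
feature_input = Input(shape=(timesteps, n_features))
dense_1 = Dense(4, activation="relu")(feature_input)
dense_2 = Dense(8, activation="relu")(dense_1)  # (batch_size, timesteps, 8)

# Averaging over the time axis removes it: (batch_size, 8)
pooled = GlobalAveragePooling1D()(dense_2)

# Two small heads turn the pooled vector into h_0 and c_0
h_0 = Dense(8, activation="relu")(pooled)
c_0 = Dense(8, activation="relu")(pooled)

series_input = Input(shape=(timesteps, n_features))
lstm = LSTM(8)(series_input, initial_state=[h_0, c_0])
out = Dense(1, activation="sigmoid")(lstm)
pooled_model = Model(inputs=[feature_input, series_input], outputs=out)

# Quick shape check on random data: 3 samples
preds = pooled_model.predict(
    [np.random.rand(3, timesteps, n_features),
     np.random.rand(3, timesteps, n_features)],
    verbose=0,
)
```

If the features really are constant over time, dropping the time dimension from the input is simpler; pooling is mainly useful when you cannot change how the data is fed in.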
A working example would be:
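from tensorflow.keras.layers import Input, Dense, LSTM
from tensorflow.keras.models import Model

# Static (non-time-dependent) features, one vector per sample
feature_input = Input(shape=(5,))
# Two separate dense stacks produce h_0 and c_0, each of shape (batch_size, 8)
dense_1_h = Dense(4, activation='relu')(feature_input)
dense_2_h = Dense(8, activation='relu')(dense_1_h)
dense_1_c = Dense(4, activation='relu')(feature_input)
dense_2_c = Dense(8, activation='relu')(dense_1_c)
# Time series input: any number of timesteps, 5 features per step
series_input = Input(shape=(None, 5))
lstm = LSTM(8)(series_input, initial_state=[dense_2_h, dense_2_c])
out = Dense(1, activation="sigmoid")(lstm)
model = Model(inputs=[feature_input, series_input], outputs=out)
model.compile(loss='mean_squared_error', optimizer='adam', metrics=["mape"])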