How to handle LSTMs with many features in python?

Posted by 醉酒当歌 on 2020-04-16 04:23:19

Question


I have a binary classification problem, and I use the following Keras model for it.

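# Imports assume the Keras bundled with TensorFlow 2.x
from tensorflow.keras.layers import Input, LSTM, Dense, Dropout, concatenate

# Four input sequences of different lengths, each with 6 features per time step
input1 = Input(shape=(25, 6))
x1 = LSTM(200)(input1)
input2 = Input(shape=(24, 6))
x2 = LSTM(200)(input2)
input3 = Input(shape=(21, 6))
x3 = LSTM(200)(input3)
input4 = Input(shape=(20, 6))
x4 = LSTM(200)(input4)
# Each LSTM returns its last state (200 values); concatenating gives 800 features
x = concatenate([x1, x2, x3, x4])
x = Dropout(0.2)(x)
x = Dense(200)(x)
x = Dropout(0.2)(x)
# Single sigmoid unit for the binary class probability
output = Dense(1, activation='sigmoid')(x)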

However, the results I get are extremely bad. I suspect the reason is that I have too many features, so the model needs more or better layers after the concatenate.
I was also wondering whether it would help to use a Flatten() layer after the concatenate.
Since I am new to deep learning, I am not sure how to turn this into a better model.

I am happy to provide more details if needed.


Answer 1:


Here is what I can suggest:

  1. Remove everything that prevents overfitting, such as Dropout and regularizers. Your model may not be able to capture the complexity of your data with the given layers, so make sure it is able to overfit before you add any regularization.

  2. Then increase the number of Dense layers and the number of neurons in each layer until you see some improvement. It is also possible that your data is too noisy, or that you have too little data to train on, in which case the model cannot produce useful predictions at all.

  3. If you are lucky and you now see overfitting, add Dropout and regularizers back.
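The three steps above can be sketched as a small model builder in which dropout and the dense head are parameters. This is only an illustration, assuming the Keras bundled with TensorFlow 2.x; `build_model`, its arguments, the ReLU activation, the Adam optimizer and the layer sizes are assumptions that do not come from the question. Unlike the original model, dropout is applied after each Dense layer rather than directly after the concatenate.

```python
from tensorflow.keras.layers import Input, LSTM, Dense, Dropout, concatenate
from tensorflow.keras.models import Model

# Sequence lengths of the four inputs in the question; 6 features per time step.
SEQ_LENS = (25, 24, 21, 20)
N_FEATURES = 6


def build_model(lstm_units=200, dense_units=(200,), dropout_rate=0.0):
    """Hypothetical helper: four LSTM branches, a variable dense head, optional dropout."""
    inputs, branches = [], []
    for seq_len in SEQ_LENS:
        inp = Input(shape=(seq_len, N_FEATURES))
        inputs.append(inp)
        branches.append(LSTM(lstm_units)(inp))
    x = concatenate(branches)
    for units in dense_units:
        # ReLU is an assumption; the Dense(200) in the question had no activation.
        x = Dense(units, activation="relu")(x)
        if dropout_rate > 0:
            x = Dropout(dropout_rate)(x)
    output = Dense(1, activation="sigmoid")(x)
    model = Model(inputs=inputs, outputs=output)
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


# Steps 1 and 2: no dropout and a larger head, to check that the model can overfit.
model = build_model(dense_units=(256, 128), dropout_rate=0.0)
# Step 3: once training accuracy is high but validation accuracy lags,
# rebuild with e.g. build_model(dense_units=(256, 128), dropout_rate=0.2).
```

Comparing training accuracy with validation accuracy after each change shows which step you are in: if training accuracy itself stays low, the model is still underfitting and regularization will not help.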

Because neural networks are trained with gradient-based optimization, training can end up in a poor local minimum. You may need to run training several times with different initial weights before you get a good result. You can also try a different loss function, but keep in mind that the loss of a multi-layer network is generally non-convex in its weights, so no choice of loss guarantees that every local minimum is a global one.
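Retraining from several initializations can be sketched as follows. This assumes TensorFlow 2.7 or later for `tf.keras.utils.set_random_seed`; `best_of_seeds` and `build_fn` are hypothetical names, where `build_fn` is any function that returns a freshly built, compiled Keras model. Ranking runs by their lowest validation loss is a design choice made for this example, not something the answer prescribes.

```python
import tensorflow as tf


def best_of_seeds(build_fn, inputs, labels, seeds=(0, 1, 2), epochs=20):
    """Train one model per seed; return (best_seed, best_val_loss, all_losses)."""
    results = {}
    for seed in seeds:
        # Seeds Python, NumPy and TensorFlow, so each run starts from
        # different initial weights.
        tf.keras.utils.set_random_seed(seed)
        model = build_fn()
        history = model.fit(inputs, labels, epochs=epochs,
                            validation_split=0.2, verbose=0)
        # Keep the lowest validation loss reached during this run.
        results[seed] = min(history.history["val_loss"])
    best_seed = min(results, key=results.get)
    return best_seed, results[best_seed], results
```

If the validation loss varies widely between seeds, initialization matters for this problem; if every seed is equally bad, the cause is more likely the data or the architecture.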

If you still can't get a better result:

You may need to try a different topology, because an LSTM still models the system as if it had the Markov property. You could look at Nested LSTM or similar architectures, which model the system so that the next time step does not depend only on the current time step.




Answer 2:


The Dropout right before the output layer could be problematic. I would suggest removing both Dropout layers and evaluating performance, then reintroducing regularization once the model performs well on the training set.



Source: https://stackoverflow.com/questions/60859157/how-to-handle-lstms-with-many-features-in-python