What's the difference between a bidirectional LSTM and an LSTM?

闹比i 2021-01-29 20:10

Can someone please explain this? I know bidirectional LSTMs have a forward and a backward pass, but what is the advantage of this over a unidirectional LSTM?

What is each of them better suited for?

5 Answers
  •  悲&欢浪女
    2021-01-29 20:48

    In comparison to a plain LSTM, a BLSTM (or BiLSTM) has two networks: one accesses past information in the forward direction, and the other accesses future information in the reverse direction. wiki
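
    Conceptually, the wrapper just runs one LSTM forward and one LSTM backward over the input and merges the two output sequences. A minimal sketch of that idea with the functional API (illustrative only, not the wrapper's actual source):

    from keras import backend as K
    from keras.models import Model
    from keras.layers import Input, LSTM, Lambda, Concatenate

    inputs = Input(shape=(5, 10))                  # (timesteps, features)
    fwd = LSTM(10, return_sequences=True)(inputs)  # reads t = 0 .. T-1
    bwd = LSTM(10, return_sequences=True,
               go_backwards=True)(inputs)          # reads t = T-1 .. 0
    # re-align the backward outputs in time before merging,
    # which is what Bidirectional does with merge_mode='concat'
    bwd = Lambda(lambda x: K.reverse(x, axes=1))(bwd)
    outputs = Concatenate()([fwd, bwd])            # (T, 10 + 10) per sample
    model = Model(inputs, outputs)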

    A Bidirectional wrapper class is provided, as per the official doc here: https://www.tensorflow.org/api_docs/python/tf/keras/layers/Bidirectional

    from keras.models import Sequential
    from keras.layers import LSTM, Bidirectional

    model = Sequential()
    model.add(Bidirectional(LSTM(10, return_sequences=True), input_shape=(5, 10)))
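
    By default the outputs of the two directions are concatenated (20 features per timestep in the example above); the wrapper's merge_mode argument ('concat', 'sum', 'mul', 'ave', or None) controls how they are combined:

    model = Sequential()
    model.add(Bidirectional(LSTM(10, return_sequences=True),
                            merge_mode='sum',   # element-wise sum instead of concat
                            input_shape=(5, 10)))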
    

    and the recurrent activation function can be set like this:

    model = Sequential()
    model.add(Bidirectional(LSTM(num_channels,
                                 implementation=2,
                                 recurrent_activation='sigmoid'),
                            input_shape=(input_length, input_dim)))
    

    A complete example using the IMDB data looks like this. These are the results after 4 epochs:

    Downloading data from https://s3.amazonaws.com/text-datasets/imdb.npz
    17465344/17464789 [==============================] - 4s 0us/step
    Train...
    Train on 25000 samples, validate on 25000 samples
    Epoch 1/4
    25000/25000 [==============================] - 78s 3ms/step - loss: 0.4219 - acc: 0.8033 - val_loss: 0.2992 - val_acc: 0.8732
    Epoch 2/4
    25000/25000 [==============================] - 82s 3ms/step - loss: 0.2315 - acc: 0.9106 - val_loss: 0.3183 - val_acc: 0.8664
    Epoch 3/4
    25000/25000 [==============================] - 91s 4ms/step - loss: 0.1802 - acc: 0.9338 - val_loss: 0.3645 - val_acc: 0.8568
    Epoch 4/4
    25000/25000 [==============================] - 92s 4ms/step - loss: 0.1398 - acc: 0.9509 - val_loss: 0.3562 - val_acc: 0.8606
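
    Note that val_loss stops improving after the first epoch while training accuracy keeps climbing, so the model begins to overfit; in practice you would stop early or add more regularization.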
    

    The full BiLSTM (BLSTM) script:

    import numpy as np
    from keras.preprocessing import sequence
    from keras.models import Sequential
    from keras.layers import Dense, Dropout, Embedding, LSTM, Bidirectional
    from keras.datasets import imdb
    
    
    n_unique_words = 10000 # keep only the top 10,000 most frequent words
    maxlen = 200
    batch_size = 128
    
    (x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=n_unique_words)
    x_train = sequence.pad_sequences(x_train, maxlen=maxlen)  # pad/truncate to maxlen
    x_test = sequence.pad_sequences(x_test, maxlen=maxlen)
    y_train = np.array(y_train)
    y_test = np.array(y_test)
    
    model = Sequential()
    model.add(Embedding(n_unique_words, 128, input_length=maxlen))
    model.add(Bidirectional(LSTM(64)))          # 64 units per direction -> 128 outputs
    model.add(Dropout(0.5))
    model.add(Dense(1, activation='sigmoid'))   # binary sentiment prediction
    
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    
    print('Train...')
    model.fit(x_train, y_train,
              batch_size=batch_size,
              epochs=4,
              validation_data=(x_test, y_test))  # a tuple, not a list
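
    After training, held-out performance can be checked with the usual evaluate call (a small usage sketch):

    score, acc = model.evaluate(x_test, y_test, batch_size=batch_size)
    print('Test loss:', score)
    print('Test accuracy:', acc)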
    
