lstm

Taking the last state from BiLSTM (BiGRU) in PyTorch

我的未来我决定 posted on 2021-02-07 07:52:49
Question: After reading several articles, I am still quite confused about the correctness of my implementation for getting the last hidden states from a BiLSTM. Related reading: Understanding Bidirectional RNN in PyTorch (TowardsDataScience); PackedSequence for seq2seq model (PyTorch forums); What's the difference between “hidden” and “output” in PyTorch LSTM? (StackOverflow); Select tensor in a batch of sequences (PyTorch forums). The approach from the last source (4) seems the cleanest to me, but I am still uncertain if I …
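A minimal sketch of what the question is getting at, assuming a single-layer bidirectional nn.LSTM with batch_first=True and no padding (all sizes below are made up): h_n already contains the final forward and backward states, and they can be cross-checked against output.

```python
import torch
import torch.nn as nn

# Minimal sketch (not the asker's code): pulling the last forward/backward
# hidden states out of a bidirectional LSTM in PyTorch.
batch_size, seq_len, input_size, hidden_size = 4, 7, 10, 16
lstm = nn.LSTM(input_size, hidden_size, batch_first=True, bidirectional=True)

x = torch.randn(batch_size, seq_len, input_size)
output, (h_n, c_n) = lstm(x)  # output: (batch, seq, 2*hidden), h_n: (2, batch, hidden)

# h_n is laid out as (num_layers * num_directions, batch, hidden);
# for a single layer, index 0 is the forward direction and index 1 the backward one.
h_forward = h_n[0]   # final forward state, i.e. time step t = seq_len - 1
h_backward = h_n[1]  # final backward state, i.e. time step t = 0
last_state = torch.cat([h_forward, h_backward], dim=1)  # (batch, 2*hidden)

# Equivalent view through `output` (only valid when there is no padding):
assert torch.allclose(output[:, -1, :hidden_size], h_forward, atol=1e-6)
assert torch.allclose(output[:, 0, hidden_size:], h_backward, atol=1e-6)
```

With padded batches, the `output`-based indexing above would have to use each sequence's true length instead of position -1, which is why many answers prefer reading the states directly from h_n.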

Recurrent NNs: what's the point of parameter sharing? Doesn't padding do the trick anyway?

久未见 posted on 2021-02-07 06:54:32
Question: The following is how I understand the point of parameter sharing in RNNs. In regular feed-forward neural networks, every input unit is assigned its own parameter, so the number of input units (features) determines the number of parameters to learn. When processing e.g. image data, the number of input units is the same across all training examples (usually a constant pixel size * pixel size * RGB frames). However, sequential input data like sentences can come in highly …
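A small sketch of the contrast the question describes, with made-up layer sizes: a dense layer's parameter count grows with the (padded) sequence length, whereas an RNN reuses the same weights at every time step, so its parameter count is independent of length.

```python
import torch.nn as nn

# Hypothetical sizes, for illustration only.
feat, hidden = 50, 64

# Feed-forward: parameters grow with sequence length, so a fixed length must be chosen.
ff_short = nn.Linear(10 * feat, hidden)   # for length-10 sequences
ff_long = nn.Linear(100 * feat, hidden)   # for length-100 sequences

# RNN: the same weight matrices are reused at every time step,
# so the parameter count does not depend on sequence length at all.
rnn = nn.RNN(feat, hidden, batch_first=True)

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(ff_short), count(ff_long), count(rnn))
# roughly 32k vs 320k for the dense layers, while the RNN stays constant (~7k)
```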

LSTM Autoencoder problems

守給你的承諾、 posted on 2021-02-06 16:14:30
Question: TL;DR: the autoencoder underfits the time-series reconstruction and just predicts the average value. Question set-up: here is a summary of my attempt at a sequence-to-sequence autoencoder. The image was taken from this paper: https://arxiv.org/pdf/1607.00148.pdf. Encoder: a standard LSTM layer; the input sequence is encoded in the final hidden state. Decoder: an LSTM cell (I think!); reconstruct the sequence one element at a time, starting with the last element x[N]. The decoder algorithm is as follows for a sequence …
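A hedged sketch of the set-up the question describes (this is an assumption about the architecture, not the asker's code): an LSTM encoder summarizes the sequence in its final hidden state, and an LSTMCell decoder reconstructs the sequence in reverse order, feeding each prediction back in as the next input.

```python
import torch
import torch.nn as nn

# Sketch of an LSTM encoder / LSTMCell decoder seq2seq autoencoder
# (names and sizes are illustrative assumptions).
class Seq2SeqAutoencoder(nn.Module):
    def __init__(self, n_features: int, hidden: int):
        super().__init__()
        self.encoder = nn.LSTM(n_features, hidden, batch_first=True)
        self.decoder_cell = nn.LSTMCell(n_features, hidden)
        self.out = nn.Linear(hidden, n_features)

    def forward(self, x):                   # x: (batch, seq_len, n_features)
        _, (h, c) = self.encoder(x)         # h, c: (1, batch, hidden)
        h, c = h.squeeze(0), c.squeeze(0)
        # Start decoding from the last input element x[N], as described above.
        step_in = x[:, -1, :]
        recon = []
        for _ in range(x.size(1)):
            h, c = self.decoder_cell(step_in, (h, c))
            step_in = self.out(h)           # predicted element, fed back in
            recon.append(step_in)
        # Reverse so the reconstruction lines up with the original order.
        return torch.stack(recon[::-1], dim=1)
```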

How to restore punctuation using Python? [closed]

半腔热情 posted on 2021-02-04 21:57:26
Question (closed as needing more focus): I would like to restore commas and full stops in text without punctuation. For example, take this sentence: "I am XYZ I want to execute I have a doubt". I would like to detect that there should be one comma and one full stop in the above example: "I am XYZ, …"
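One common way to frame this (a sketch under that assumption, with a made-up tag set and sizes, not any specific library's API) is per-token sequence labeling: predict, for each word, which punctuation mark, if any, should follow it.

```python
import torch.nn as nn

# Hypothetical punctuation tag set: label 0 means "no punctuation after this token".
PUNCT_TAGS = ["", ",", "."]

class PunctuationTagger(nn.Module):
    def __init__(self, vocab_size: int, emb: int = 64, hidden: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.lstm = nn.LSTM(emb, hidden, batch_first=True, bidirectional=True)
        self.classify = nn.Linear(2 * hidden, len(PUNCT_TAGS))

    def forward(self, token_ids):                # (batch, seq_len)
        h, _ = self.lstm(self.embed(token_ids))  # (batch, seq_len, 2*hidden)
        return self.classify(h)                  # per-token logits over PUNCT_TAGS

def restore(tokens, tag_ids):
    """Reassemble text by appending the predicted mark after each token."""
    return " ".join(tok + PUNCT_TAGS[t] for tok, t in zip(tokens, tag_ids))
```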

How can I add a Bi-LSTM layer on top of a BERT model?

大憨熊 posted on 2021-01-29 15:22:30
Question: I'm using PyTorch and the base pretrained BERT to classify sentences for hate speech. I want to implement a Bi-LSTM layer that takes as input all outputs of the last transformer encoder from the BERT model, as a new model (a class that implements nn.Module), and I got confused with the nn.LSTM parameters. I tokenized the data using bert = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=int(data['class'].nunique()), output_attentions=False, output…
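A minimal sketch of one way to wire this up, assuming the Hugging Face transformers BertModel API and a made-up LSTM hidden size and label count: the last encoder layer's token embeddings (batch, seq_len, 768 for bert-base) are fed to a Bi-LSTM, and the concatenated final forward/backward states go to a linear classifier.

```python
import torch
import torch.nn as nn
from transformers import BertModel

class BertBiLSTMClassifier(nn.Module):
    def __init__(self, num_labels: int, lstm_hidden: int = 256):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        self.lstm = nn.LSTM(
            input_size=self.bert.config.hidden_size,  # 768 for bert-base
            hidden_size=lstm_hidden,
            batch_first=True,
            bidirectional=True,
        )
        self.classifier = nn.Linear(2 * lstm_hidden, num_labels)

    def forward(self, input_ids, attention_mask):
        bert_out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        tokens = bert_out.last_hidden_state        # (batch, seq_len, 768)
        _, (h_n, _) = self.lstm(tokens)            # h_n: (2, batch, lstm_hidden)
        last = torch.cat([h_n[0], h_n[1]], dim=1)  # forward ++ backward states
        return self.classifier(last)               # (batch, num_labels)
```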

Error when checking target: expected dense_1 to have shape (257, 257) but got array with shape (257, 1)

*爱你&永不变心* posted on 2021-01-29 14:10:33
Question:

print(X.shape, Y.shape)  # (5877, 257, 1) (5877, 257, 1)
model = Sequential()
model.add(LSTM(257, input_shape=(257, 1), stateful=False, return_sequences=True))
model.add(Dense(257, activation='sigmoid'))
model.compile(loss=losses.mean_squared_error, optimizer='adam', metrics=['accuracy'])
model.fit(x=X, y=Y, epochs=100, shuffle=False)

Error when checking target: expected dense_1 to have shape (257, 257) but got array with shape (257, 1)

I should give 5877 frames of size 257 to the LSTM layer. The output …
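A hedged sketch of one likely fix (an assumption about the intent, not a definitive answer): with return_sequences=True the Dense layer is applied per time step, so Dense(257) produces an output of shape (257, 257) per sample, while the target Y has shape (257, 1). Emitting a single unit per time step matches the target.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Sketch only: keep return_sequences=True, but make the final Dense layer
# output one value per time step so the model's output shape is (257, 1),
# the same as the target Y.
model = Sequential()
model.add(LSTM(257, input_shape=(257, 1), stateful=False, return_sequences=True))
model.add(Dense(1, activation='sigmoid'))   # per-sample output: (257, 1)
model.compile(loss='mean_squared_error', optimizer='adam')
model.summary()
```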

How to use many-to-one LSTM with variable-length input on Keras?

て烟熏妆下的殇ゞ posted on 2021-01-29 09:33:58
Question: I have a multi-class sequence labeling problem where the number of time steps varies between samples. To use an LSTM with variable-length input, I applied zero padding and masking to my input. I've read here that propagation of the mask stops after an LSTM layer with return_sequences=False, and that part confused me. My question is: would it be okay to use an LSTM with return_sequences=False to compute the loss correctly for the architecture below? from tensorflow.keras.layers import LSTM, …
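A minimal many-to-one sketch with made-up sizes, assuming one label per sequence: padded inputs pass through a Masking layer, the LSTM with return_sequences=False returns its output at each sample's last non-padded time step, and the mask not propagating further is fine because the loss is computed on a single output per sample.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Masking, LSTM, Dense

# Hypothetical sizes for illustration.
max_len, n_features, n_classes = 20, 8, 5

model = Sequential([
    Masking(mask_value=0.0, input_shape=(max_len, n_features)),
    LSTM(64, return_sequences=False),   # state taken at the last non-masked step
    Dense(n_classes, activation="softmax"),
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")

# Example: a batch of 2 sequences with true lengths 5 and 12, zero-padded to 20.
X = np.zeros((2, max_len, n_features), dtype="float32")
X[0, :5] = np.random.rand(5, n_features)
X[1, :12] = np.random.rand(12, n_features)
y = np.array([2, 4])                    # one integer class label per sequence
model.fit(X, y, epochs=1, verbose=0)
```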